Selected Factors Which Influence Job Preferences


Clifford E. Jurgensen 
Minneapolis Gas Light Company 


Employee motivation is generally assumed to have an important 
effect on personnel relations, production indices, employee turnover, 
and other such factors which play an important part in determining the 
over all well-being of any business or industrial concern. The significant 
motivational factors in the employee situation have not been adequately 
determined and agreed upon by workers in the field, and will probably 
continue to present difficulties because the potency of many of those 
factors is apt to be largely unconscious. Indirect evidence can be 
obtained from employee morale studies and from statements of job ap- 
plicants and employees with regard to the relative importance of factors 
which are commonly assumed to be significant determiners of job satis- 
faction. Although it may be contended that such statements do not get 
to the root of the motivational problem because they are limited to conscious
beliefs, such statements are nevertheless important in the industrial
and business situation. What employees believe to be true is frequently of
greater importance than what is actually true.

The lack of experimental data on the importance of factors influencing 
job satisfaction is particularly surprising in view of the diversity of opin- 
ions held by psychologists, personnel men, operating executives, union 
leaders, and others. Solution of industrial conflict is frequently stale- 
mated because management and union representatives are unable to 
agree on what is desired by employees. Each such representative, of 
course, believes he has ample “proof” that his opinion is correct, and 
vigorously denies that his opinion is influenced by his own personal de- 
sires or by a minority group of employees who are excessively verbal. 

In spite of the great need for studies of employee wants or preferences, 
few have been made. One such study was reported by Chant (4) in 1932. 
His results have not gained wholehearted support from management and 
union leaders because of the relatively small number of employees included
in the study and because his miscellaneous group was composed of
persons belonging to the Young Men’s Department of a YMCA, which
group might or might not be typical of workers in general. At the present 
time further criticism is directed to the study on the basis that employee 
preferences are not the same today as they were fifteen years ago when 
that study was made. Chant’s results are given in the first two columns
of Table 1. The original study utilized the paired comparison techniques 
and a scale value was assigned each factor. Only the rank order is given 
here.

Table 1
Rank Order of Job Preference Factors as Obtained by Various Investigators

Source:                               Chant     Chant     Wyatt &   Berdie      Blum and Russ
                                                          Langdon
Group:                                Misc.     Dept.     Women     Male        Males     Females
                                      Workers   Store     Factory   H. S.
                                                Workers   Workers   Graduates
Number:                               150       100       325       150         181       105

Opportunity for Advancement           1         1         5         2           1         1
Steady Work                           2         2         1         1           2         2
Opportunity to Use Your Ideas         3         3         7         4           -         -
Opportunity to Learn a Job            4         4         8         7           -         -
Opportunity to be of Public Service   5         7         -         8           -         -
Good Boss                             6         5         4         9           4         3
High Pay                              7         6         6         3           3         4
Good Working Companions               8         8         3         5           -         -
Comfortable Working Conditions        9         9         2         11          -         -
Clean Work                            10        11        -         10          -         -
Good Hours                            11        10        9         6           5         5
Easy Work                             12        12        10        12          -         -

Wyatt and Langdon (7) conducted a study in 1937 which was com- 
parable to that made earlier by Chant. Inasmuch as their data were 
obtained in England, applicability to the American situation is question- 
able. Their data are given in the third column of Table 1. 

In 1942, Blum and Russ (3) reported a study of employee attitudes 
based on use of printed questionnaires distributed in a personal inter- 
view to 286 persons residing in the New York City Area. Five items were 
presented in paired comparison form, each respondent being requested to 
give his preference within each of the pairs. Results were summarized in 
terms of per cent of times each item was checked as the preferred item. A
summary of their results (in terms of rank order) for males and females is 
given in the last two columns of Table 1. Like the earlier studies, that of 
Blum and Russ was objected to on the basis of the relatively small number
of individuals involved. It has the further disadvantage of being limited 
to only five factors. 

Berdie (2) reported a study based on 150 male high school graduates 
who came to the Testing Bureau of the University of Minnesota in the 
spring and summer of 1940. He asked these persons to rank eighteen 
factors in order of importance. Twelve of these factors were worded 
sufficiently similar (if not identical) to the terminology used by Chant to 
permit comparison of the studies. Results on these overlapping factors
are given in Table 1. Berdie found that his group was representative
(insofar as entrance examinations were concerned) of the total freshman
male class entering the University in the fall quarter; therefore, the group
may or may not have given results comparable to high school graduates
seeking employment.


Description of Questionnaire Used 


The study reported here was made in order to obtain data on the re- 
lative importance, among job applicants, of various factors which are 
frequently mentioned as having an important bearing on job satisfaction. 
Ten factors were selected for inclusion in a questionnaire form. Because 
all of these factors were considered important, it appeared inadvisable 
to use a rating scale technique. It is likely that most, if not all, of the 
factors would have been given the highest possible rating with the conse- 
quent result that little differentiation would have been obtained. The 
primary purpose was to determine the relative rather than absolute im- 
portance of each factor. 

The paired comparison technique has one advantage not shared by 
other techniques; namely, that the respondent can be asked to make a 
choice in such a way that selection of one factor specifically decreases the 
opportunity to benefit from another related factor. This was done by 
Blum and Russ who, for example, asked individuals to choose between: 


Receive more pay and have an unfriendly supervisor, or 
Receive less pay and have a friendly supervisor. 


Paired comparisons do not solve all problems, however, inasmuch as 
the respondent may still wonder how much more or less pay would be in- 
volved or how friendly or unfriendly the supervisor might be. Another 
disadvantage of the paired comparison technique so far as the present 
study is concerned resides in the inclusion of ten factors. In order that 
all possible comparisons could be made, forty-five pairs would have to be 
presented. This would not only be time consuming to the applicant, but 
might frequently be confusing because of the duplication of each item on 
nine different occasions. 
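
The count of forty-five pairs, and the nine appearances of each item, follow from the formula n(n - 1)/2 with n = 10. The short sketch below merely enumerates the pairs to confirm the arithmetic; the factor names are those of the questionnaire, while the variable names are illustrative only.

```python
from itertools import combinations

# The ten factors used in the questionnaire.
FACTORS = [
    "Advancement", "Benefits", "Company", "Co-workers", "Hours",
    "Pay", "Security", "Supervisor", "Type of Work", "Working Conditions",
]

# Every unordered pair of distinct factors.
pairs = list(combinations(FACTORS, 2))

n = len(FACTORS)
assert len(pairs) == n * (n - 1) // 2

# How many pairs each factor appears in.
appearances = {f: sum(f in pair for pair in pairs) for f in FACTORS}

print(len(pairs))                  # 45
print(set(appearances.values()))   # {9}
```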
A third possible technique is that of rank order, wherein each respondent
is asked to rank each item in order of importance to him. This can
be done rather quickly, and although there are numerous theoretical 
differences between the paired comparison and rank order techniques, 
other investigators have found that results tend to be equally valid (1) 
and to give almost identical results (5). 

The rank order technique carries with it the rather serious disadvan- 
tage that differences between adjacent ranks do not necessarily indicate 
differences of equal magnitude (6). This difficulty is frequently overcome 
by converting ranks to linear scores, but involves the assumption of 
normality of the distribution, which assumption cannot be adequately 
defended in a study of this type involving only ten factors which were 
arbitrarily selected. On the other hand, treating ranks as linear scale 
units results in a certain degree of inaccuracy when applying various 
statistical procedures. The rank order method was selected in spite of 
its limitations because, in this particular case, its assets appeared to out- 
weigh its limitations. 
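
The point about unequal intervals can be illustrated with a rough sketch. It assumes one common convention for converting ranks to normal scores (a percent position of (R - 0.5)/N, then the corresponding normal deviate); this convention is an assumption made for illustration and is not a procedure used in the present study. Under that assumption the distance between ranks 1 and 2 is considerably larger than the distance between ranks 5 and 6.

```python
from statistics import NormalDist

def rank_to_normal_score(rank, n_items):
    """Convert a rank to a normal deviate.

    Assumption (not from the study): percent position = (rank - 0.5) / n_items,
    with rank 1 receiving the highest score.
    """
    percent_position = (rank - 0.5) / n_items
    return NormalDist().inv_cdf(1.0 - percent_position)

# Scores for ranks 1 through 10.
scores = [rank_to_normal_score(r, 10) for r in range(1, 11)]

# Distance between each pair of adjacent ranks.
gaps = [scores[i] - scores[i + 1] for i in range(len(scores) - 1)]

for rank, gap in enumerate(gaps, start=1):
    print(f"rank {rank} to rank {rank + 1}: {gap:.3f}")
```

Treating ranks as linear units, as was done here, amounts to setting all of these gaps equal.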

The ten selected factors were printed in questionnaire form following 
suitable instructions developed with trial samples of applicants. Each 
factor was briefly defined. Semantic difficulties were recognized. For 
example, “advancement (opportunity for promotion)” may be interpreted
as advancement on the basis of individual merit regardless of seniority, or 
sufficient job opportunities on a higher level when, as, and if the requisite 
seniority has been obtained. Difficulties of this type were recognized 
from the beginning, but it was considered inadvisable to arouse emotional 
feelings on the matter of seniority versus merit. Direct expression of 
terms such as merit or seniority might have resulted in persons responding 
in a manner dissimilar to their actual preference because of intense feelings 
for or against unions. The extent to which such errors were avoided by 
using more generalized or innocuous terms is unknown. 

A copy of the questionnaire used is given in Figure 1. Individuals
were not required to sign their names, although they were requested to 
give other personal information desired for various analyses. These data 
were obtained by means of a check list which appeared on the reverse 
side of the questionnaire proper. (See Figure 2.) 


Description of Population 


All applicants of the Minneapolis Gas Light Company between Sep- 
tember 1, 1945 and August 31, 1946, were requested to fill in a copy of the 
job preference questionnaire. A total of 1360 applicants filled in this 
questionnaire. Twenty-one cases were discarded for failure to follow 
instructions (e.g., ranking all items as 1). The usable questionnaires 
consisted of 150 answered by women and 1189 by men. 
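
A minimal sketch of such a screening step is given below. The rule used (each of the ranks 1 through 10 must appear exactly once) is an assumption; the study states only that cases were discarded for failure to follow instructions. The sample responses are hypothetical. The final lines check the counts reported above.

```python
N_FACTORS = 10

def is_usable(ranks, n_items=N_FACTORS):
    """Return True when the ranks are exactly 1..n_items, each used once.

    Assumed screening rule; the study does not spell out its criteria.
    """
    return sorted(ranks) == list(range(1, n_items + 1))

# Hypothetical responses.
good = [3, 1, 2, 7, 6, 5, 4, 8, 9, 10]
all_ones = [1] * N_FACTORS

print(is_usable(good))      # True
print(is_usable(all_ones))  # False

# Counts reported in the text.
returned, discarded = 1360, 21
usable = returned - discarded
assert usable == 150 + 1189
print(usable)               # 1339
```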

JOB PREFERENCES
(What makes a job good or bad?)

Decide which of the following is most important to you and place a 1 on the line in
front of it. Then decide which is second in importance to you and place a 2 in front
of it. Continue listing the items in order of importance to you till the least important
is ranked 10.

All the items are important, but people differ in the order in which they rank them.
There are no right or wrong answers. Answer according to how you think, not how
you believe others think.

_____ ADVANCEMENT (Opportunity for promotion)

_____ BENEFITS (Vacation, sick pay, insurance, etc.)

_____ COMPANY (Employment by company you are proud to work for)

_____ CO-WORKERS (Fellow workers who are pleasant, agreeable, and good working
      companions)

_____ HOURS (Good starting and quitting time, good number of hours per day or
      week, day or night work, etc.)

_____ PAY (Large income during year)

_____ SECURITY (Steady work, no lay-offs, sureness of being able to keep your
      job)

_____ SUPERVISOR (A good boss who is considerate and fair)

_____ TYPE OF WORK (Work which is interesting and well liked by you)

_____ WORKING CONDITIONS (Comfortable and clean; absence of noise, heat,
      cold, odors, etc.)

BE SURE TO FILL IN OTHER SIDE

Fig. 1. Copy of questionnaire.


Because of the small number of women included, no attempt was made 
to determine differences accompanying changes in their age, marital 
status, education, etc. Questionnaires filled in by men were analyzed 
according to personal data as given in Figure 2 except for monthly salary. 
Many former servicemen gave their army or navy pay as their previous 
salary, and others listed their former civilian salary. Among those who 
had remained civilians, some had worked in highly paid temporary war 
industry positions and others had remained in lower paid but more secure 
non-war industries. Because of lack of comparability among these 
groups, no analysis of job preferences was made on the basis of previous 
salary. 

Although the total group of men included in this study was 1189, the 
analyses for various subgroups contain fewer individuals because of 
failure of some applicants to supply all the requested personal data. 

The study reported here must be recognized as limited to job appli- 
cants within a single company in a single midwestern city. It appears 
unlikely that any major differences would be found among job applicants 
and employees except in extreme cases. It is less certain, however, that 
no selective process functions with regard to the company or the city.

The following data about yourself are desired for research purposes:

1. Sex
   _____ Male
   _____ Female

2. Marital status
   _____ Single
   _____ Married
   _____ Widowed
   _____ Divorced
   _____ Separated

3. Dependents (besides yourself)
   _____ None
   _____ One
   _____ Two
   _____ Three
   _____ Four
   _____ Five
   _____ More than five

4. Age
   _____ Under 20
   _____ 20-24
   _____ 25-29
   _____ 30-34
   _____ 35-39
   _____ 40-44
   _____ 45-49
   _____ 50-54
   _____ 55-59
   _____ 60 or over

5. Monthly salary (if unemployed give most recent salary)
   _____ Less than $100
   _____ $100-$149
   _____ $150-$199
   _____ $200-$249
   _____ $250-$299
   _____ $300-$349
   _____ $350-$399
   _____ $400-$449
   _____ $450-$499
   _____ $500 or over

6. Education
   _____ 8th grade or less
   _____ Some high school, but not completed
   _____ High school or vocational school diploma
   _____ Diploma plus technical or business school
   _____ Some college, but not completed
   _____ College or University degree
   _____ Advanced University degree

7. Main occupation _______________


Fig. 2. Personal data requested by questionnaire.


It is possible, though no evidence exists for or against such possibility,
that applicants for a utility (the Minneapolis Gas Light Company, in this 
case) differ from those of another type of organization. It is also possible 
that applicants in Minneapolis differ from those of other communities, 
though, again, no such evidence exists. In spite of these possibilities, it 
is the unsupported opinion of the author that the results obtained are 
typical of those which would be obtained in other companies and in other 
cities.


Discussion of Results


Summary results are given separately for men and women in Table 2, 
and more detailed results for various groups of men are given in Table 3. 
Inasmuch as these tables are based on the mean rank assigned each factor 
by the applicant group, low means are indicative of high preference value 
and high means are indicative of low preference values. 
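
The mean rank of a factor is simply the average of the ranks assigned to it by all respondents in the group. The sketch below uses three hypothetical respondents and only three factors to keep the arithmetic visible; the function name and the data are illustrative and are not taken from the study.

```python
from statistics import mean

def mean_ranks(responses):
    """Average the rank given to each factor across all respondents."""
    factors = responses[0].keys()
    return {f: mean(r[f] for r in responses) for f in factors}

# Hypothetical rankings (1 = most important).
responses = [
    {"Security": 1, "Pay": 3, "Hours": 2},
    {"Security": 1, "Pay": 2, "Hours": 3},
    {"Security": 2, "Pay": 1, "Hours": 3},
]

result = mean_ranks(responses)

# Lowest mean rank first, that is, highest preference first.
ordered = sorted(result, key=result.get)

for factor in ordered:
    print(f"{factor}: {result[factor]:.2f}")
```

In this illustration Security has the lowest mean and therefore the highest preference value.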










The argument can be raised that factors are not mutually exclusive 
(for example, advancement is usually accompanied by an increase in pay) 
and that the broad definitions increase these inter-relations. That the 
factors were interpreted with sufficient independence to fulfill the purpose
of this study is evidenced by the large range of obtained means. 

Results obtained from applicants are significant with respect to
similarities as well as differences between the various groups. Tables 2 
and 3 indicate the absence in many cases of large or important differences
which might be expected to occur with changes such as marital status, 
dependents, age, etc. Many of the differences (as small as .4) are statis- 
tically significant at the one percent level, though it is questionable 
whether such differences are of any practical importance. 

No complete discussion of results will be attempted. Details can be 
obtained by the interested reader from careful inspection of Tables
2 and 3. 














Table 2
Job Preferences of Men and Women Applicants

                           Mean Rank           Level of
Sex:                    Men       Women      Significance
Number:                 1189      150

Security                3.1       4.7            .01
Advancement             3.4       4.5            .01
Type of Work            3.7       2.8            .01
Company                 4.6       5.0            .10
Co-workers              6.1       5.6            .02
Pay                     6.3       6.5            .30
Supervisor              6.3       5.2            .01
Hours                   6.9       6.3            .01
Working Conditions      7.3       6.2            .01
Benefits                7.3       8.2            .01





Men were more interested than women in security, advancement, and 
benefits; and women were relatively more interested in type of work, co- 
workers, supervisor, hours, and working conditions. These differences 
lend support to the hypothesis that the typical woman is interested in 
working for a relatively short period and is not as seriously interested as 
men in making long-range work plans. 
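
The grouping of factors in the paragraph above can be recovered from Table 2 by a simple subtraction. The sketch below takes the values from Table 2, reading the significance levels as .01, .02, .10, and .30. The cutoff of .05 is an assumption made for illustration; the study does not state which level was used in drawing up the lists.

```python
# (mean rank for men, mean rank for women, level of significance), from Table 2.
TABLE_2 = {
    "Security":           (3.1, 4.7, 0.01),
    "Advancement":        (3.4, 4.5, 0.01),
    "Type of Work":       (3.7, 2.8, 0.01),
    "Company":            (4.6, 5.0, 0.10),
    "Co-workers":         (6.1, 5.6, 0.02),
    "Pay":                (6.3, 6.5, 0.30),
    "Supervisor":         (6.3, 5.2, 0.01),
    "Hours":              (6.9, 6.3, 0.01),
    "Working Conditions": (7.3, 6.2, 0.01),
    "Benefits":           (7.3, 8.2, 0.01),
}

CUTOFF = 0.05  # assumed cutoff, for illustration only

# A lower mean rank means a higher preference, so a positive
# (women minus men) difference means men valued the factor more.
men_higher = [f for f, (m, w, p) in TABLE_2.items() if p <= CUTOFF and w - m > 0]
women_higher = [f for f, (m, w, p) in TABLE_2.items() if p <= CUTOFF and w - m < 0]

print("Men:", men_higher)
print("Women:", women_higher)
```

With that cutoff, Company and Pay drop out as not reliably different, and the two lists agree with those given in the text.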

Marital status had relatively little effect on job preferences, although 
existing differences were basically in the same direction as sex differences 
in that single men tended toward the preference direction shown by 
women. 

As the number of dependents increased, greater relative importance 
was attached to security, company, co-workers, supervisor, and benefits; 
and less importance was given type of work, pay, hours, and working conditions.
Changes in preference sometimes appeared suddenly in one or
two places (see Type of Work and Company) and reversals were present 
in the case of some factors (see Hours). 

Contrary to popular opinion, security did not increase in importance 
with advancing age and advancement did not decrease. For the most 
part, changes in job preferences accompanying age changes were not 
linear trends, but occurred suddenly and were accompanied by reversals. 

Job preferences were affected more by extent of education than by 
most of the other variables. Advancement, type of work, pay, and 
working conditions became more important as extent of education in- 
creased; and security, company, co-workers, supervisor, hours, and bene- 
fits became less important. Changes were not always gradual, the points 
of high school graduation and college attendance being particularly im- 
portant. 

Comparing sales applicants with mechanical applicants, the former 
were relatively more interested in advancement, type of work, company, 
and pay; and relatively less interested in security, co-workers, hours, and 
benefits. Clerical applicants showed a mixture of sales and mechanical 
preferences, being most like sales applicants with respect to company and 
hours, and most like mechanical applicants with respect to co-workers 
and pay. Within the area of mechanical work, skilled applicants were 
more interested in advancement and type of work than were unskilled 
applicants, and less interested in co-workers, supervisor, and hours. 


Uses for Data 


These data are useful for many purposes. They provide a background 
for collective bargaining which is based on actual findings rather than 
predictions which may be grossly inaccurate. They are valuable in the 
determination of personnel policies and procedures. For purpose of 
employee recruitment they facilitate the writing of advertisements which 
emphasize those factors which are ranked highest by most applicants. 
They are also valuable for supervisory training by increasing the supervi- 
sors’ understanding of what factors are deemed most important by em- 
ployees. 

Job Preference blanks as filled in by specific individuals also have 
various uses; for example, as an interview aid. A high ranking of “type 
of work” provides an opening for discussion of what type of work the 
applicant would most like to have, both for the present and for the future. 
High job preference for co-workers provides a basis for discussing the type 
of person with whom the applicant works best, the type of person apt to 
irritate the applicant, etc. A high rating for supervisor can be used to 
elicit a similar type of information in that area. Data of this type can
be exceedingly valuable in securing information on sociability, maturity, 
stability, work habits, etc. 

A technique has been developed to score job preference blanks in the 
same manner as other employment tests, and validities are now being 
obtained. 

A comparison of an individual’s ranking of the factors from the view- 
point of preference and again from the viewpoint of present satisfactori- 
ness should shed light on employee morale as it relates to the particular 
factors listed on the blank. If the “preference” ranking is higher than the 
“satisfaction” rating it would appear highly desirable for the management 
to improve conditions relating to that factor. 
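
The comparison just described might be sketched as follows. The two rankings are hypothetical, and the function name is illustrative; no such scoring routine is described in the study. A factor is flagged when its preference rank is higher (numerically smaller) than its satisfaction rank, and flagged factors are listed with the largest discrepancy first.

```python
def factors_needing_attention(preference, satisfaction):
    """List factors ranked higher on preference than on satisfaction.

    Rank 1 is the highest rank, so "higher" means a smaller number.
    Factors with the largest gap come first.
    """
    flagged = [f for f in preference if preference[f] < satisfaction[f]]
    return sorted(flagged, key=lambda f: satisfaction[f] - preference[f], reverse=True)

# Hypothetical rankings by one employee (five factors for brevity).
preference = {"Security": 1, "Type of Work": 2, "Pay": 3, "Hours": 4, "Benefits": 5}
satisfaction = {"Security": 5, "Type of Work": 3, "Pay": 1, "Hours": 4, "Benefits": 2}

flagged = factors_needing_attention(preference, satisfaction)
print(flagged)  # ['Security', 'Type of Work']
```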


Conclusions 


From these data it would appear that both management and labor 
leaders (individually and collectively) have erred considerably in their 
statements and demands with regard to job preferences of applicants from 
whom employees are selected. Union agreements typically emphasize 
“wages, hours and working conditions,” which factors did not turn out
to be the most important in the opinion of job applicants. In negotiation 
of union contracts, as well as in day-to-day relationships between manage- 
ment and union, considerable emphasis is placed on security and benefits. 
Security would appear to warrant such emphasis, but benefits decidedly 
does not warrant such emphasis on the basis of desire although it may be 
warranted on the basis of need. In this case it would appear inadvisable 
for either the company or unions to expect appreciation to any great 
extent. 

Discrepancies between demands of union officers and preferences of 
applicants can be explained by any or all of the following hypotheses: 
(1) union officials do not know what employees desire most, (2) union
officials are more interested in what they believe employees should have
rather than desire to have, or (3) union officials are more interested in gaining
relatively unimportant but tangible items which can “sell” the union
to employees than in more desired but less obvious and less tangible gains.
It is interesting to speculate on the ultimate effect on unions of their
failure to emphasize the factors most important to employees. This
failure assumes greater importance in view of the recent swing of public 
opinion and legislation against unions. It may become necessary for un- 
ions to shift their emphasis if they are to retain their members. 

Management also has erred in many respects. The type of work 
being done by the employee is frequently considered by management to 
be of little importance to employees except in terms of gross classification 
such as sales, mechanical, clerical, administrative, etc. Results obtained
here indicate the contrary. Mean ranks obtained in this study are in- 
consistent with management’s practice of making transfers or promotions 
with little discussion with the employees involved. Although manage- 
ment may contend that few employees verbally object to such changes, 
this cannot be interpreted as absence of preferences or dislike. It is 
highly probable, in light of these data, that employees refrain from object- 
ing to such changes on the basis of fear of consequences (such as incurring 
displeasure of management) or feelings of futility. Such fears and 
feelings do not tend toward favorable employee morale, and are apt to 
find direct or indirect expression at some later date. Unions have also 
erred in failure to emphasize the importance of type of work. Since this 
factor is of such great importance to large numbers of persons, it is likely 
that union leaders could better convince persons of the merits of unionization
if through such organization they could more readily obtain better
liked work. It would appear highly profitable for representatives of 
management and unions cooperatively to develop methods and procedures 
which would insure better placement of employees in the type of work 
which they would most enjoy. 

These data also indicate that management should spend far more time 
and effort than is customarily expended to “sell” the company to its
employees. There is greater need for house organs, bulletin boards, 
employee induction manuals, and other communication lines which are 
used to give information to employees which increases their pride in their
company. 

The importance attached by applicants to a good supervisor indicates 
the value of supervisory training, especially with regard to improving 
their techniques for handling employees. 

In summary, both management and unions have been guilty of empha- 
sizing factors which are considered relatively unimportant by applicants, 
and have failed to give sufficient consideration to factors considered of 
most importance by applicants. This mutual failure would seem to 
provide an excellent opportunity mutually to devise principles and pro- 
cedures which will result in greater job satisfaction on the part of em- 
ployees. 

Data obtained in this study are at variance with what many persons 
would anticipate through armchair reflections as well as with common
practice in business and industry. For this reason it appears that more 
attempts should be made to determine facts experimentally which pertain 
to personnel work. Only by so doing can personnel work achieve a high 
degree of efficiency. 


Received December 11, 1946. 
Men and Machines*


Jack W. Dunlap 
Division of Bio-Mechanics, The Psychological Corporation, New York City 


The development of industrial psychology in the past quarter of a 
century has been rapid but sporadic. Critical demands during each of 
the two world wars accentuated this development, but in different ways. 
From a psychologist’s point of view, the outstanding need of World War 
I was to train masses of men quickly for different types of duty. This 
need stimulated the evolution of psychological methods of selection and 
placement. The practical value of these procedures was appreciated by 
industry almost immediately, and in the post-war period, many psycholo- 
gists were invited to apply their newly acquired techniques to business 
and industrial problems. In time, they added such other activities to 
their practice as progressive managers permitted. Industrial psychology
was, and is yet, predominantly a selection psychology. 

The distinguishing feature of World War II from a psychologist’s 
standpoint was its highly technological character. In this war, greater 
masses of men were trained for a greater variety of duties than in World 
War I, and the contributions of American psychologists in their selection 
and training have been well documented. 

In the earlier part of the war, many different kinds of specialists 
sought to devise more efficient machines for seeking out the enemy and 
destroying him. Many of these devices embodied highly desirable tech- 
nical features which were impractical because the operation of these 
devices was too complex for the average man in the service. With the 
largest American army and navy in history, there weren’t enough suffici- 
ently skilled men to operate or repair these devices. Often, although 
suitable untrained personnel could be obtained, the time required for 
training was excessive. Some radar repairmen, for example, needed more 
than fourteen months of training. Even then, additional practical ex- 
perience was considered desirable before they were ready for overseas 
service. Under the exigencies of war, many of them acquired their 
practical experience overseas. 

* Presented as the Presidential Address before the Division of Consulting Psychology
at the 55th annual meeting of the American Psychological Association on September 9,
1947, at Detroit.

Shortly after the beginning of the war, a small group of psychologists
was asked to seek ways of adapting equipment to suit the operator instead
of selecting men to fit equipment. The results were so significant that
the term “human engineering” began to creep into the vocabulary
of line officers. Before the war was ended, line officers not only listened 
to these psychologists, but exhibited a faith in their powers to a most 
flattering, if unwarranted, degree. Such psychologists did more than 
articulate nicely the machine and its operator. They were concerned 
also with the way one machine operator articulated with another close 
by, with the kind of information he needed to operate his machine, with 
how quickly he could get this information with a minimum of error, with 
the optimal conditions of operation, and other factors. Starting with an 
interest in the interrelationship of men and machines, psychologists ap- 
plied many of their professional techniques to problems which had only a 
secondary association with machines. The approach was different from 
that of the time-and-motion engineer. The time-and-motion engineer 
traditionally treats the machine as a constant and man as variable. Op- 
timal movements were explored which were most suitable to continued 
operation on a particular machine by a statistical or noncorporeal “average
man.” This group of engineers has contributed to the redesign of
some equipment, but this activity has been marginal to the main trend of 
their work. World War II, then, accelerated the beginnings of a new 
branch of industrial psychology. It has been called “bio-mechanics,”
“human engineering,” “bio-technology,” “psychophysical-systems research,”
and other names. They all describe the planning of the machine,
as E. F. DuBois has put it, “from the man outward, considering
the instruments and controls to be extensions of the man’s nervous 
system” (2, p. 15). 

The success of military and civilian psychologists in human engineer- 
ing was acknowledged by design engineers and consulting groups working 
with the services. Their continued interest in human engineering has 
serious implications for psychologists in these ways: 


1. Education of engineers along psychological lines. 

2. Training psychological personnel as human engineers to participate 
in the design of new equipment. 

3. Modification of the duties and capacities for service of the in- 
dustrial psychologist. 


At the age of 40 years, 60 per cent of engineering graduates are in 
positions entailing administrative responsibility, according to Taylor 
& Boelter (9). This coincides with the aspirations of engineers as revealed
by Karl Compton (1, p. 71): “On the walls of the national headquarters
of the engineering societies in New York, there hangs the definition:
‘Engineering is the art of directing men and controlling the forces
and materials of nature for the benefit of the human race.’ While some 
may feel that this definition is too broad and may cite examples of men
not called engineers, who ‘direct men and control (or try to control) the 
forces and materials of nature for the benefit of the human race,’ never- 
theless, such men are really operating in the field of the engineer.” 

It is perhaps superfluous to point out that many engineers do a rela- 
tively fine job of administration even though they have had virtually 
no formal training in the “art of directing men.” Engineers have received
little formal training in this art because there has been too little
objectively verified material for them to learn. The industrial strife 
of the past decade perhaps made them realize that the handling of human 
material sometimes decides the quality and efficiency of an engineering 
enterprise. S. A. Lewisohn, in his excellent book “Human Leadership
in Industry, the Challenge of Tomorrow,” defines the problem clearly 
(7, p. 48). He quotes Herbert Hoover as stating: “In these days of 
largely corporate proprietorship, the owners of mines are guided in their 
relations with labor by engineers occupying executive positions. On 
them falls the responsibility in such matters, and the engineer becomes 
thus a buffer between labor and capital.” 

Lewisohn continues: “The question is: What preparation have they had 
to act as such a buffer? A background limited to physics, chemistry, 
mathematics, mechanics, and other specific sciences does not equip a man 
to act as a buffer between labor and capital . . . for their present re- 
sponsibilities some training in psychological problems and the mental at- 
titudes of men, some knowledge of modern sociological tendencies, some 
grasp of the incentives that make men act, some acquaintance with the 
purposes of trade-unions and the art of collective bargaining, and some 
understanding of the technique of human engineering are indispensable.” 

In browsing through issues of Mechanical Engineering for the year 
1946, I found no less than twenty-two articles expressing the engineer’s 
concern with the problem of handling men. It is a tribute to the engi- 
neer that he recognizes his shortcomings and would like to do something 
about them. Articles on this subject were only slightly less common 
than those dealing with problems of atomic energy. Some of these arti- 
cles merely specified that a problem exists and something should be done 
about it. Some writers recommended broadening the humanistic-social 
base of engineering education. Two articles contained rule-of-thumb 
techniques of human management. Two other writers stressed the need 
for management research and invited the engineer to apply his engi- 
neering training to such research. 

In none of these articles was there a report on the results of an experi- 
mental approach to the problem under the controlled conditions with which 
we are familiar in experimental psychology.
I think this is significant because the engineer has been trained to
solve problems by experimentation. Idle discussion is foreign to his 
tradition. These articles indicate that the administrative engineer wants 
to solve social problems and lacks the know-how to solve them. This 
deficiency is not to his discredit. Sociologists, economists, labor-relations 
specialists, and psychologists have puzzled over these problems for years 
and have made not much greater headway. Clearly, this is a vast prob- 
lem, which requires a co-operative approach by all of these professions. 
In my opinion, the psychologist can make a particularly strong contri- 
bution to their solution, not because he possesses any peculiar set of facts, 
but by the application of principles and techniques developed or tested 
during World War II. The psychologist should have the opportunity 
to apply his knowledge to this industrial problem on a sufficiently large 
scale to make his research significant. 

Broadening the training of the engineer in social-humanistic subjects 
is a current trend. Rensselaer Polytechnic Institute has increased by
one-third the time given to psychology and other humanistic studies. 
Newark College of Engineering has reorganized its curriculum to place 
greater emphasis on human problems. The Department of Engineering, 
University of California, Los Angeles, California, is expanding its curriculum
to include two “bio-technical” courses. The first of these, The
Dynamics of Human Function and Behavior, deals with practical psychol- 
ogy, and with the physical structure, thermo-dynamics, and machinery 
of the body. The second course, The Influence of Environment on Man, 
delineates his interaction with the atmospheric, thermal, bacterial, radia- 
tional, and chemical aspects of the environment, and includes a study of 
socioeconomics (9). The psychologist, at this time, can make the soundest 
contribution to the content of such courses by exploiting the field of bio- 
mechanics. 

The engineer can no longer be satisfied with “fatal” limits in extending
his control over the environment of man. Dr. DuBois outlines the idea 
in this way: “Perhaps we are wrong in trying to place a sharp limit on 
the factors of safety and devote so much attention to the fatal level. It 
may be useful to know the fatal level, but the engineer should concentrate 
on the levels when men first become inefficient. If a certain machine, 
such as an airplane, is under the control of a man, its efficiency corre- 
sponds with that of the man. If in a sharp turn a pilot loses his vision, 
both man and machine are blind. If the pilot at an altitude of 38,000 
feet develops intolerable bends, he and the plane must descend to a much 
lower altitude. At extremely high or low temperatures, there is a marked 
loss of mental capacity as well as loss of muscular power and control . . .” 
(3, p. 627). The psychological literature contains many verified data on
the effect of environmental change on performance. Such information
should be helpful to an engineer’s training. 

Psychologists at universities with engineering schools may anticipate 
a demand for courses in psychology suitable for engineers. Such courses 
would (1) make available to the engineer pertinent psychological data; 
(2) impress on the engineer the importance of considering limitations of 
the operator in designing equipment; (3) describe to the engineer the kind 
of research techniques that have been useful to the psychologist; and (4) 
form a basis for mutual understanding, respect, and intelligent co-opera- 
tion on problems peculiar to each group. 

Some psychologists have expressed concern about the possibility of 
competition between the engineer and the industrial psychologist. I 
believe this concern is groundless. Engineering schools have enough 
trouble in training a good engineer in five college years; seven or eight 
years is not too long a time for training a good psychologist. Ultimately, 
the engineer and the psychologist will have to work together, at least in 
the bio-technical field. The technical scope is too vast for one type of 
professional person. It would be wiser for the psychologist and the 
engineer to acquire a mutual appreciation of their technical skills and 
understand their own limitations. 

To summarize what I have said, engineers have evinced an acute interest
in the techniques of dealing with men. Through their experience
in World War II many engineers have been impressed by the psycholo- 
gist’s contribution to machine design. In fact, during committee de- 
liberations in 1945-46, regarding the establishment of a Science Re- 
search Foundation, engineers supported psychology as a science to be 
included with the physical sciences. Psychology must become a part 
of the curriculum of the engineer, and psychologists will be expected to 
co-operate in preparing or giving suitable courses. 

Let us consider now the opportunities for the psychologist trained in 
human-engineering techniques. A great amount of work on equipment 
design, carried on by military agencies, has been summarized by Kappauf 
(5), Taylor (8), Fitts (4), and Kelly (6). The opportunities for em- 
ployment by these service agencies require no further elaboration. 

We may be about to experience the greatest change in style of life 
since the Industrial Revolution. The full utilization of atomic energy for 
peacetime pursuit will bring us close to supersonic speed, inter-stellar 
space and exploitation of the mineral wealth in the arctic zones. Man
will work and live in strange worlds; thus new and unusual problems will
confront the psychologist. For example, consider a simple problem in 
vision in thrust craft operating at a speed of 2,000 miles per hour, or about 
3,000 feet per second. Visual stimuli initiate nerve impulses which reach
the optic cortex in about 0.05 second. If an observer in such a craft
were to look at an object directly abeam, he would have flown 150 feet 
beyond it before his brain registered the stimulus. Such an observer 
would probably feel a bond of sympathy with the mythical bird who 
flies backwards because he wants to see where he has been. Seriously,
the design of control devices by means of which an operator can govern 
such craft offers a real challenge to psychologists. Because of the high 
altitudes at which such craft may operate, the entire field of visual science
will have to be restudied, for there is reason to believe that the upper 
sky is always dark. This, in turn, poses problems in high-altitude navi- 
gation and camouflage. 
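
The 150-foot figure above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function name is hypothetical, and it uses the exact conversion from miles per hour, whereas the text rounds 2,000 miles per hour to about 3,000 feet per second.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def distance_during_latency(speed_mph, latency_s):
    """Feet travelled while a visual stimulus is still on its way to the cortex."""
    feet_per_second = speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return feet_per_second * latency_s


if __name__ == "__main__":
    # 2,000 mph and a 0.05 second latency are the figures quoted in the text.
    print(f"{distance_during_latency(2000, 0.05):.0f} feet with the exact conversion")
    print(f"{3000 * 0.05:.0f} feet with the rounded 3,000 feet per second")
```

With the exact conversion the craft covers slightly less than 150 feet, so the rounded figure in the text is a fair approximation.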

As natural resources are consumed, man will be tempted to find and 
remove the wealth of the arctic regions. This, in turn, will create an 
extensive series of physiological, sociological, and psychological problems 
with regard to the action and interaction of human organisms under ex- 
treme environmental conditions. These and hundreds of other similar 
problems may seem fantastic, but I am convinced that most of us will 
live to see them treated as routine. 

Fascinating as it is to speculate about our future, there are many in- 
dustrial problems which require the skill of psychologists right now. 
There are few industries which appreciate the usefulness of the psycholo- 
gist as much as the aircraft industry. Yet there is recurring evidence of 
the need for more of the psychologist’s services. One day, not so long 
ago, I visited a plant in which jet fighter planes are being built. As I 
stood near the end of a runway, I saw a towing crew working on a crumpled
automobile. There were other crumpled cars near it, and my curiosity
was aroused when I counted 12 badly damaged cars in that parking
lot. I learned that the day before one of the jet planes was rolling along 
the runway just after landing and that the brakes when they were ap- 
plied by the pilot would not hold. It must have been embarrassing to 
him, for many of those planes land at 120 miles per hour or more. Why 
the brakes failed does not matter to us, so let us consider what the pilot 
could have done. He could have ground-looped, that is, touched one
wing to the ground and taken his chances of survival, but there were 
workmen along the edge of the strip, and they might have been injured. 
It appeared that the wisest thing to do was to retract the wheels and slow 
down by skidding along the plane’s fuselage. In this particular plane, 
the “wheel up” lever is behind and to the left of the pilot’s seat, so that
it is operated after a blind search by “feel.” To make certain that
it is not inadvertently tripped by the elbow during flight, a safety
cover has to be raised before the lever can be operated. The pilot tried 
to manipulate the switch, failed, and tried again, but by this time, he
had rolled off the runway and into the parking lot. The plane smashed
into car after car and finally turned over, supported by a car at each wing 
tip. Fortunately, the pilot escaped with minor injuries and climbed 
out of the plane before rescuers arrived. Two things seemed clear: there
should be emergency brakes, and if “wheels up” is to be considered as an
emergency landing procedure, this control should be placed where the 
pilot can get at it quickly. This control is in an inconvenient and danger- 
ous position, and yet it has to be used twice in every flight. After the ac- 
cident, it was obvious that the control lever was improperly placed. This 
example emphasized the need for examination of equipment in the design 
stage from the view point of human abilities and limitations. During 
the war many psychologists helped engineers by examining new equip- 
ment in an attempt to prevent just such occurrences. 

The importance of vision in industrial plants has been dramatized by 
dispensers of safety goggles by means of such instruments as the “Sight 
Screener,” and the “Orthorater.” These are not the only ways in which
the psychological principles of vision can be employed. The law of contrast
has been used widely by safety engineers in painting dangerous areas and 
machines. The same law can be applied to the function of a machine. 
In the pharmaceutical business, “paddles” are employed for counting 
tablets. A paddle is an unpainted piece of aluminum or wood in which 
a predetermined number of holes have been bored part of the way 
through. The holes are slightly larger than the tablets to be counted. 
The operator shoves the paddle into a container of tablets to pick up a 
load, and then shakes it with a wrist motion until each hole contains a 
tablet. This mechanical method of counting is simple and efficient, but 
it is not foolproof. Counting errors are made, and these appear to be 
the fault of the operators. An operator occasionally fails to perceive 
that one of the holes does not contain a tablet and pours a short count 
into the container. This perceptual error is due, in part, to the fact that 
the color of the tablets and of the paddle do not contrast sufficiently. 
Inspection precision was improved simply by placing a spot of contrasting 
color, somewhat smaller than the diameter of the tablets, in the bottom 
of each hole. 

Motor habit patterns have been investigated by psychologists for de- 
cades. A great deal is known about establishing such patterns quickly 
and efficiently, but very little work has been done on the extinction of 
motor patterns. There is a real and practical need for additional ex- 
perimental work on the extinction of habit patterns, for the results can be 
applied directly to industrial and military situations. In cases of emergency,
surprise, or fatigue, an individual tends to revert to earlier inappropriate
motor patterns. This fact is not well known, nor is it usually
considered by designers of equipment, possibly because the problem has
not been recognized by engineers. Today, a driver can operate any 
American passenger automobile without being confused by the gear shift, 
but only a short time ago there was the Buick shift, the Dodge shift, the 
standard shift, and the Ford planetary drive. The lack of standardization
creates an accident hazard based on motor habit patterns. That
it is not a problem of the past was shown by the recently publicized ac- 
cident of the Royal Dutch Airlines, in which the pilot pulled what he 
thought was the flap-retractor lever but actually was the landing-gear 
lever. The result was a serious accident in which several persons lost 
their lives and the plane was almost completely destroyed. The pilot 
“knew” where the flap retractor lever was; he had been checked out in 
the plane; he was familiar with it, but he reverted to an old motor habit 
established in another type of plane. This is not an isolated error; it 
has occurred again and again, both in civilian and military aircraft. 
Pilots flying aircraft with which they are relatively unfamiliar have 
switched to empty gas tanks, cut the ignition of good motors when they 
intended to cut bad motors, or have feathered the wrong propellers when 
engines failed. Such errors can be obviated by standardizing the cock- 
pit, and by the application of psychological data to the principles of 
design as they relate to the operator. Elimination of earlier motor pat- 
terns is not so simple, but the need for work on the general problem is 
extremely urgent.

The entire transportation industry is filled with problems concerning 
the individual and the machine, not only from the standpoint of operation 
and safety, but also that of comfort of the passenger. Railroads have 
been severely criticized for their lack of consideration for the passengers’ 
comfort. True, railroad companies have promised wonder trains for the 
post-war era, but for the most part, these are still on the drawing boards, 
or at least only pilot models are available. The problems of lighting, 
noise, and heating cannot be solved by current engineering methods alone, 
but must be solved, in part, in terms of the human factor. Reducing 
the noise level in a railway coach to a given number of decibels is not 
enough. Noise should be expressed in such terms as: “Is the noise level 
sufficiently low so that passengers sitting in adjacent seats can converse 
in low normal tones?” I could go on discussing this specific problem, 
but my purpose is merely to stimulate psychologists to think in terms of
applying their knowledge to transportation problems. When they do, 
they will make a real contribution to the travelling public. Buses and 
street cars, and that mighty denizen of the open road, the truck, all need 
careful scrutiny. Some problems are common to all of these, but each 
has large groups of problems peculiar to itself. For example, consider
cross-country trucks, with their problems of relief drivers, fatigue, road
strain, cab design, seat design, and bunks for relief drivers, to mention 
only a few of the problems involving the operator which need to be 
studied. 

The radio-manufacturing industry can profitably use the services of 
design psychologists. One company questioned the advisability of auto- 
matic volume controls in receivers. The first question their engineers 
proposed to the psychologist was, “Do radio listeners need and want 
automatic volume control?” It was found that listeners desired this 
feature, so the next question was, “How sensitive must the control unit
be?” Basically the problem was to determine the point at which in- 
dividuals would adjust volume control, regardless of the level at which 
they initially set the volume. Experiments proved that the volume 
could vary as much as four decibels before the listener would readjust 
the volume. The psychologist thus was able to provide the engineer 
with definite limits of sensitivity for designing the automatic volume 
control unit. 
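
One way to read this result is as a dead band: the automatic control need not correct fluctuations smaller than those the listener would tolerate without touching the dial. The sketch below is only an illustration of that reading. It assumes the four-decibel figure applies symmetrically above and below the level the listener chose and that levels are expressed as power, neither of which the account above states; the function names are hypothetical.

```python
import math

TOLERANCE_DB = 4.0  # drift listeners accepted before readjusting (figure from the text)


def level_change_db(reference_power, current_power):
    """Change in level, in decibels, between two power values."""
    return 10.0 * math.log10(current_power / reference_power)


def needs_correction(reference_power, current_power, tolerance_db=TOLERANCE_DB):
    """True when the drift exceeds what the listener would tolerate.

    Assumption: the tolerance is symmetric about the listener's own setting.
    """
    return abs(level_change_db(reference_power, current_power)) > tolerance_db


if __name__ == "__main__":
    # Doubling the power is a change of about 3 dB, inside the dead band.
    print(needs_correction(1.0, 2.0))
    # Tripling the power is a change of about 4.8 dB, outside the dead band.
    print(needs_correction(1.0, 3.0))
```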

The psychologist has given little thought to the architecture of homes, 
stores, office buildings, theatres, and other places of public assembly. 
One need only examine any major occupation, such as transportation, 
building, manufacturing in all its forms, or even farming, to observe in- 
numerable problems involving the inter-action of men and machines. 

Within a short time, therefore, I predict that more psychologists will 
be sought for the solution of bio-mechanical problems on a full-time basis 
in the automobile, radio, home appliance, transportation, and other in- 
dustries. Their principal contribution will be to provide research in 
equipment design. 

Equipment design is but one aspect of bio-mechanics. You will recall
that, during World War II, the human engineer started with a narrow
man-machine relationship and found it necessary to extend his study. 
The man-machine relationship was a point of departure. The influence of 
World War II on industrial psychology probably will be even greater than
the influence of World War I. It is my opinion that industry is aware of
the many problems which can be solved by human engineering. Managers
will seek industrial psychologists for this type of service. The industrial
psychologist will have to add to his technique of selection and
training the techniques of adapting machines to man. 

In the broadest sense, the industrial psychologist is concerned with 
the development of a more productive society. Greater attention must 
be given to the articulation of man and machine to achieve greater in- 
dustrial production, and so, the industrial psychologist must be prepared 
to delve into the remotest phases of this particular problem. It is often
difficult to determine where bio-mechanics ends and other aspects of
industrial psychology begin. For example, deliberate slow-downs may 
be caused by the application of time-and-motion methods, or by the 
installation of more efficient equipment. The attitude and morale of a 
worker, then, is important. A tire manufacturer installed some new 
equipment. Motion studies in other plants had shown the average 
worker could process 24 tires per hour on this equipment. After train- 
ing, he found his workers were producing only 17 tires an hour and later 
the average dropped to 13. A review of the work method indicated minor 
adjustments, including additional pay, which finally brought production 
up to 15 tires per hour. The plant never reached 24 per hour because 
of a deliberate worker slow-down to prevent the possible lay-off of some 
workers. Clearly, that manager failed to recognize the psychological 
effect of new equipment and new methods on the social environment of 
the workers. If these problems, with their social overtones of the rela- 
tionship between men and men (and I say men and men advisedly, not 
labor and management), had been recognized as were the relations be- 
tween men and machines, it would have been possible to develop a 
transition program for the workers, based on psychological principles. 
Such a situation can be alleviated after it occurs, but only at considerable 
expense in terms of human emotions and production losses, which could 
have been avoided by proper preparation. 

A great deal is written about industrial safety programs and the use 
of automatic safety devices. There is no question about the need for such 
devices, but the inter-action of men and machines is not perfect. Recently 
I was advised of an amusing incident related to safety devices, and I say 
it was amusing because no one was seriously injured before the source of 
difficulty was identified. In a plant which prides itself on its safety pro- 
gram, women operate a heavy stamping machine. When the power 
pedal is activated, a vertical shield drops in front of the mechanism and 
an upright bar moves horizontally across the face of the machine to shove 
aside the hand of the careless operator. These safety devices were 
painted in conspicuous colors, for the plant makes wide use of color in 
its safety program. One day, shortly after the equipment was installed, 
a girl operator suddenly fainted. Thereafter, scarcely a day went by 
when one or more girls did not give up and quietly slide to the floor. No 
amount of physical examination or study of medical history gave any 
clue to the phenomenon. Finally, a psychologist who had worked for 
the plant was called in and succeeded in finding the cause after a little 
study of the working conditions. The monotony of the task, and the 
movement of the colored safety mechanism set up a condition required 
for hypnosis. Once the difficulty had been identified, the solution was
easy. I might say that the psychologist’s professional stature was not
reduced.

Safety devices for emergency control are not adequate unless they are 
immediately and readily available to the operator. Often such controls 
are added as afterthoughts, and are located because of engineering con- 
venience rather than the convenient use of the operator. During the war 
a series of accidents occurred in Navy planes, in which the pilot’s head 
came in violent contact with the gun-sight during landings aboard car- 
riers. To eliminate such accidents, a shoulder harness was devised to 
hold the pilot so that his head could not come in contact with the gun-sight.
Unfortunately for the pilot, it was necessary for him to lean forward
and down into the cockpit to adjust a lever during landing. It was
impossible for him to do this while he was in the harness, so he released 
the harness and took a chance with the gun-sight. The shoulder harness 
was perfect for the job it was designed to do, but the engineer forgot 
the man and what the man had to do. 

Recently, I heard of an interesting problem in “selection.” An industrial
organization had an entire contract cancelled because five items
in a lot of 100 gross were defective. These items were worth only a 
fraction of a cent each, but the full contract ran to thousands of dollars. 
The management decided, “What we need is a good selection system for
inspectors.” Therefore, a psychologist was called in to establish a selection
system. The first question to be answered was “Selection for what?”
So, a thorough study had to be made to ascertain why defective items 
slipped by the inspectors. The items passed before the inspectors on a 
conveyor at the rate of 200 per minute. If an item was defective, the 
inspector removed it from the belt and placed it in a container at her 
side. When the container became heaped up, she would bend down 
and level the pile of material, and her eyes were away from the belt for
one to ten seconds. Many of the younger inspectors were likely
to watch the young foreman as he moved about the department. Al- 
though selection might help, particularly if elderly, unattractive men were 
the object of selection, the fundamental problem was to determine the 
“attention distractors” and to find means for reducing or eliminating such 
distractions. Some remedies are immediately obvious, such as rearrange- 
ment of the work stations with regard to the disposal of defective items, 
training in the use of peripheral vision for handling defective items, and 
so on. 
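
The figures quoted above give a rough sense of the exposure. The sketch below is illustrative only: it assumes a constant belt rate and that nothing is inspected while the inspector's eyes are off the belt, both simplifications introduced here, and the function names are hypothetical.

```python
ITEMS_PER_MINUTE = 200   # conveyor rate given in the text
ITEMS_PER_GROSS = 144


def items_missed(seconds_away, items_per_minute=ITEMS_PER_MINUTE):
    """Items that pass uninspected during one glance away from the belt."""
    return items_per_minute * seconds_away / 60.0


def defect_rate(defective, lot_gross):
    """Fraction of a lot that is defective."""
    return defective / (lot_gross * ITEMS_PER_GROSS)


if __name__ == "__main__":
    for seconds in (1, 10):
        print(f"{seconds:>2} s away: about {items_missed(seconds):.0f} items pass unseen")
    print(f"5 defective in 100 gross: {defect_rate(5, 100):.4%} of the lot")
```

On these assumptions a single ten-second glance away lets some thirty items pass unexamined, while the rejected lot contained only five defective items in 14,400. A handful of such lapses would be enough to account for the cancelled contract.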

The principles underlying bio-mechanics and the techniques employed 
by workers in this field can be applied to the practical problem of con- 
serving raw materials and increasing production. An ever-present, costly 
problem confronting a manufacturer is that of quality control, which
can be restated as a problem in the control of the variability of the product.
This variability is a function of the raw materials, the machines on
which the product is fabricated, and the workers who fabricate the 
material. It is a common practice to examine the variability of the raw 
product, and in many plants, to exercise some control over the machines. 
Only rarely is the human factor seriously considered as a contributing 
cause, and this is particularly true if the machines are automatic or semi- 
automatic. The first problem is to identify the contribution of each of 
these factors to the total variability. The part of the total variability
which is a function of the inter-action between the three major sources of 
variation should then be determined. 
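
The partition described above can be sketched with a balanced analysis of variance. The text does not say which method of analysis was used, so the technique here is an assumption. For brevity the sketch treats only two of the three sources, machines and operators, with their interaction; raw material would enter as a third factor in the same manner. The function names and the measurements are invented for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def two_way_sums_of_squares(data):
    """Split total variation for a balanced machines x operators layout.

    data[i][j] holds the replicate measurements for machine i and operator j;
    every cell must contain the same number of replicates.
    """
    a, b, n = len(data), len(data[0]), len(data[0][0])
    cell = [[mean(data[i][j]) for j in range(b)] for i in range(a)]
    row = [mean(cell[i]) for i in range(a)]                          # machine means
    col = [mean([cell[i][j] for i in range(a)]) for j in range(b)]   # operator means
    grand = mean(row)

    return {
        "machines": n * b * sum((r - grand) ** 2 for r in row),
        "operators": n * a * sum((c - grand) ** 2 for c in col),
        "interaction": n * sum((cell[i][j] - row[i] - col[j] + grand) ** 2
                               for i in range(a) for j in range(b)),
        "residual": sum((x - cell[i][j]) ** 2
                        for i in range(a) for j in range(b) for x in data[i][j]),
    }


if __name__ == "__main__":
    # Invented measurements for illustration only; not data from any study.
    example = [[[10, 12], [14, 16]],
               [[11, 13], [19, 21]]]
    for source, ss in two_way_sums_of_squares(example).items():
        print(f"{source:<12}{ss:6.1f}")
```

In a balanced layout the four sums of squares add up to the total variation about the grand mean, so each source's share of the whole can be read off directly.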

During the past months several members of the Division of Bio- 
Mechanics have worked in greige mills producing nylon hosiery. The 
general objective was to increase production and decrease consumption of 
raw materials. A series of carefully controlled investigations was in- 
stituted to determine what part of the total variation could be attributed 
to the raw nylon yarn, the machines, and to the operators. They found 
that the largest sources of error were contributed by the machines and 
the operators. Over-all plant losses as high as 8 per cent were caused 
by these two factors in addition to the normal wastage factor, and often 
neither their source nor nature was suspected. Further studies revealed 
more than a hundred ways in which well trained knitters unwittingly 
can cause their stockings to vary. For young or inadequately trained 
knitters, the variation was exaggerated. 

Once the relative contributions of these factors had been determined, 
the problem resolved itself into preparing a remedial program for the 
operators and the equipment. The first step was identifying the various 
operations which the knitter could perform and determining their effect. 
Once this was done, a sound, practical, and easily administered retraining 
program could be devised. This approach to the control of quality is a 
completely general method, and can be applied to many kinds of pro- 
ductivity other than the manufacture of hosiery. 

These are but a few examples of the variety of problems with which 
the “human engineer” may be confronted. The overlap in approach to 
industrial problems by the human engineer and the industrial psycholo- 
gist is not the only reason for a modification of the practices of the in- 
dustrial psychologist. 

When managers require psychological services, they usually do not 
know enough about the specialities of the industrial psychologist to seek 
one type of person to handle a problem of selection, another a problem of 
human engineering, and a third a problem of training. Managers 
expect a psychological agency to handle most of the problems they regard
as psychological. Correspondingly, it is reasonable to anticipate that
managers will expect the industrial psychologist on their staff to become as 
competent in this new activity as he is in selection and training. In the 
current operation of the Bio-Mechanics Division of the Psychological 
Corporation, we have found it necessary, at times, to go into problems of 
training, of selection, of location of work stations, of the study and modifi- 
cation of other physical aspects of the environment such as temperature, 
ventilation, noise level and vibration, lighting and color. We found it 
necessary to concern ourselves with the breakdown of a task for more 
equitable and more efficient distribution of effort among men and women 
working on machines. We were also concerned with the techniques of 
operating machines, of worker morale, of labor-management relations 
and time-and-motion studies. Indeed, we are expected to tackle almost 
any problem of production in which our understanding of the functions 
and limitations of human beings may lead to a solution. 

All of us have seen machines grow in complication until they have 
become a psychological burden to the average operator. It has been 
extremely important to collect scientific information about the interrela- 
tionships between men and machines with the objective of enhancing the 
efficiency of the machine and the comfort of the operator. Perhaps a 
more subtle but no less complicated alteration has been made in the 
social structure of our agencies of production. It is possible that 
most of the strife between labor and management, so costly to society 
as a whole, is a symptom of a too complicated or an inadequate social 
milieu in the industrial plant. Perhaps the mutual isolation of labor and
management, of labor’s insufficient feeling of participation in the produc- 
tion of the finished item, of an insufficient feeling of prestige and worth, 
to mention but a few possibilities, are determining factors. Here, too, we 
need to collect scientific data to determine whether the social structure 
of the factory has impinged upon a human limitation, and if so, where; 
we should not limit ourselves only to sensori-motor limitations. We 
need precise information about the differences in the morale of workers at 
different plants. The psychologist has a special advantage in approach- 
ing such problems by virtue of the rich techniques which are part of his 
tradition. 

Today, those of us who are working in bio-mechanics have drifted 
into the field on currents activated by a wide variety of interests. None 
of us received systematic training for this work and only chance or tem- 
perament have fitted us for it. We should now consider more systematic 
training for this aspect of applied psychology. 

As a starting point for a discussion of a course of training, I believe 
that most of the basic courses are now being given by many departments.
The names of such courses are not sufficient; it is the emphasis given in
such courses that is critical. For example, there are numerous courses 
in introductory statistics, such as statistics in agriculture, in education, 
in psychology, in economics, or in biology, and all of these basic courses 
contain essentially the same fundamental logic of analysis and elementary 
formulae. Yet, a student of one of these courses finds it difficult to apply 
his knowledge to the problems in another field. This difficulty de- 
pends on the direction of the course, which usually is determined by the 
experience of the instructor and the examples used for demonstration. 
Thus, we do not necessarily need new and different courses, but new 
examples with a new emphasis on the applications of the skill and tech- 
niques developed by the courses. The present courses of business and 
industrial psychology, physiological psychology, experimental psychology 
with a particular emphasis on practical problems, educational psychology 
with emphasis on the problem of learning and retraining, statistics, tests 
and measurements, interests and attitudes, and a course in the design of 
experiments all provide basic training for work in bio-mechanics. Such 
courses are not sufficient, however, unless they are deliberately aimed to- 
ward the application of psychology in human engineering. Certainly 
the field is sufficiently well defined to work out a year’s graduate course 
in bio-mechanics. Ultimately, there is no substitute for practical
experience. We, in the Bio-Mechanics Division of the Psychological
Corporation, have been happy to contribute to a partial solution of the
problem by providing interneships for properly qualified graduate students.
In addition, until there is enough literature on the subject, it would be 
desirable for some university staff members teaching graduate courses 
in statistical, industrial, physiological or experimental psychology to 
spend a year with some agency involved in a variety of human-engi- 
neering problems. 

In summary, the development of psychological activities during the 
latter part of World War II has implications for the expansion of psy- 
chological training in the education of the engineer, the development of a 
new specialty which involves the application of psychological data and 
principles to equipment design and operation, and the future
development of industrial psychology.


Learning in Accident Reduction


Edwin E. Ghiselli and Clarence W. Brown 
University of California 


On any job new employees are faced with tasks which in some way are 
unfamiliar to them. While formal inservice training attempts to
familiarize them with the new problems and aids them in developing the necessary
skills, it probably can never wholly substitute for that which the new man 
learns on his own by actually performing on the job. Regardless of the 
nature of the task, then, informal on-the-job learning may be expected 
to occur. 

In the transportation industry a peculiar situation exists in regard to 
learning and training. The skill of the operators is measured in a manner 
both objective and consequential by means of the number of accidents 
incurred. Men with too high an accident rate, even those who are
learners, cannot be permitted to operate public conveyances on the city 
streets. Obviously, therefore, the formal inservice training program must 
carry the novice to such a point in learning the new skills connected with 
the job that the amount of informal on-the-job learning is practically 
nil. With most city transit organizations the formal training period for 
street car motormen and motor coach operators generally ranges from 
two weeks to a month, about half of which is “class room” instruction 
and half supervised operation on the job. 

Psychologically, the task of operating a street car or motor coach ap- 
pears to be far from simple. The superficial aspects of the job are readily 
mastered. Adequate knowledge of operating regulations as measured 
by objective tests is learned in a few days. Observations of new workers 
and performance on the job by psychologists indicate that the actual 
motor coordinations involved in the operation of the controls of the 
vehicles are accomplished in less than an hour’s time. This results from
two facts: for street car operators the motor coordinations are of a
simple nature and so readily learned, and for motor coach operators
previous experience in driving is required before the applicant is accepted,
so that the fundamental motor coordinations already are learned to a fair
degree before formal training on coaches is given. It is the complex
activities requiring judgments of speeds and spatial relations, and the 
division of attention, performed under conditions of stress which form the 
most difficult aspect of the job to learn. Yet in formal training programs 
in the transit industry this is the aspect of training which receives least 
consideration. In part this is due to the fact that instructors ordinarily 
are simply those operators with longest experience and most patience,
and in part to a lack of detailed and systematic psychological analysis of
the task. 

It would appear desirable to obtain some information on the course 
of improvement of new operators of street cars and motor coaches when 
the men are placed out on the job entirely on their own. If on-the-job 
learning is accomplished in a very short period then it would seem likely 
that additional time devoted to formal training would be very worth 
while. On the other hand, if on-the-job learning takes place at a rather
slow rate it would appear that other solutions would be necessary, such 
as continued supervised performance. The following analysis was under- 
taken in order to obtain information relative to the course of on-the-job 
learning as indicated by the change in the number of accidents incurred 
as experience is increased. 


Methods and Procedures 


Accident records were obtained for each of 60 street car motormen and 
34 motor coach operators for the first 17 months of their employment 
after they had completed their formal training. In most cases this
training was accomplished in two to three weeks and in no case extended over
more than a month. Since traffic conditions, and hence accident rate,
show a certain seasonal variation, the cases were selected covering hiring
over a 6 months period, approximately an equal number of men being
selected for each month. A month at any point in the sequence of months
analyzed would then vary in exposure for different men and thus some
control of seasonal variation would be effected in so far as the course of
learning is concerned.
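
The staggered-hiring design described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names, the record layout, and the use of month numbers 1 to 12 are hypothetical and do not come from the study, and no actual accident figures are reproduced.

```python
from collections import defaultdict


def calendar_months_spanned(hire_months, tenure_month):
    """Calendar months (1-12) in which a given month of service falls.

    hire_months: calendar month in which each man began work on his own
                 (hypothetical encoding, 1 = January).
    tenure_month: 1 for the first month on the job, 2 for the second, etc.
    """
    return sorted({((h - 1 + tenure_month - 1) % 12) + 1 for h in hire_months})


def mean_accidents_by_tenure_month(records):
    """Average accidents per man for each successive month of service.

    records: iterable of lists, one per man, giving his accident count in
             his 1st, 2nd, 3rd, ... month on the job.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for monthly_accidents in records:
        for tenure_month, n in enumerate(monthly_accidents, start=1):
            totals[tenure_month] += n
            counts[tenure_month] += 1
    return {m: totals[m] / counts[m] for m in sorted(totals)}
```

With hiring spread over six consecutive calendar months, any one month of service is observed in six different calendar months, so a seasonal peak in traffic is not concentrated at a single point of the learning curve.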


Results 


In the accompanying chart is given the average number of accidents 
for successive months incurred by the street car motormen and by the 
motor coach operators. Because of the discrepancy in the number of 
accidents incurred in the two types of jobs, two different values were used 
on the frequency ordinate in order to bring the curves adjacent for pur- 
poses of comparison. It is apparent that with increased experience on 
the job accident rate is reduced. The reduction is approximately at the 
same rate for both groups. 

In shape, the curves are similar to ordinary learning curves, most of 
the improvement occurring in the earlier stages. Most of the improvement
as measured by a reduction in number of accidents is accomplished
by the sixth or seventh month. By the end of the seventeen-month
period the accident rate is reduced by more than half. However, even
at the end of this rather long period accident rate is still falling although 
the rate of reduction is small. 


Conclusions 


A considerable amount of on-the-job learning as measured by a re- 
duction in accident rate is shown by new street car motormen and motor 
coach operators. This improvement follows the ordinary course of most 
learning functions. Some six or seven months are required before the 
rate of improvement falls to a minimum. However, even at seventeen 
months a true plateau in learning is not reached and improvement is 
still manifest. It is therefore unlikely that even doubling the length of
the present type of formal training program of street car motormen and 
motor coach operators would significantly improve their early perform- 
ance on the job as measured by the accident criterion. Rather it would 
seem much more profitable to change the nature of the training program, 
basing it upon a sound psychological analysis of the types of abilities re- 
quired by the complex situation in which the vehicles are operated. 


Received December 23, 1946. 





Selection of Aircraft Engineering Draftsmen 
and Designers 


Harry W. Case 
University of California at Los Angeles 


The problem of obtaining adequately trained technical personnel has 
confronted the engineering departments of the aircraft industry for many 
years. At the onset of the war period when the volume of business en- 
abled the industry to offer pay rates as attractive to the beginner as those 
offered by other engineering fields, the excessively high failure and turn- 
over rate indicated to one management that factors other than pay might 
be responsible. The results of an investigation made to determine the 
factors responsible for this condition indicated among other things that 
many of the engineering college graduates hired lacked the specific ability 
to do drafting and failed to develop basic design and layout skill. A job 
analysis revealed that in part the brightness and ability of the individual 
to think in concepts involving spatial discriminations and relationships
bore directly upon his success or failure in the job.


Selection of Tests 


As an index of the brightness of the individual, the lead of other
investigators¹ in predicting the success or failure of engineering college
students was followed and the Otis Self-Administering Test of Mental
Ability, Higher Examination, was selected as the method of measurement.
Some investigators have shown a relationship between the Minnesota
Paper Form Board Test and drafting ability.² However, it was
believed that this test did not adequately meet the requirements of the
situation. Therefore, a test³ was developed which it was felt would more
adequately measure the factor of space relations utilized in the specific
type of work.

Because the program undertaken was based upon a projected long
term improvement of the hiring techniques and had to avoid disrupting
the morale of the engineering department, the same relative hiring
procedures were followed with the exception that all applicants were given
the tests after hire. The study was undertaken in 1943 but it was not until
1946 that sufficient cases were obtained so the criteria of job proficiency
and the test scores could be correlated and the degree of predictive
relationship determined.

¹ Holliday, F. An investigation into the selection of apprentices for the engineering
industry. Occup. Psychol., 1940, 14, 69-81.

² Laycock, S. R., and Hutcheon, N. B. A preliminary investigation into the problem
of measuring engineering aptitude. J. educ. Psychol., 1939, 30, 280-289.

³ Case, H. W., and Ruch, F. Survey of space relations ability. Los Angeles:
California Test Bureau. 1944.


Selection of Criteria 


The selection of criteria which would give an accurate evaluation of 
the individual’s degree of success or failure in drafting and design pre- 
sented a problem due to the conditions existing at the time when the 
study was conducted. Rates of pay, for example, were increasing rapidly 
within the industry as a result of a decreasing labor market which re- 
sulted in a constantly spiraling hiring rate. Similarly, the activities of 
Selective Service prevented terminations from becoming a reliable meas- 
ure of job failure. And finally, the numerous revisions brought about in 
the company rating scale due to extraneous conditions prevented this from 
being used as a measure of work success. 

After considerable investigation of the possibility of adapting various 
criteria, each of which appeared to give an unrelated aspect of the in- 
dividual’s performance, it was decided that the most effective method 
would be to make no check of the individuals during the period of
conflict and to delay an evaluation until the industry had reached a more
stable period. As a result, in March of 1946 the following procedure was 
established for determining the drafting and design ability of the indi- 
viduals included in the study. 

The names of those employees engaged in drafting or design work who 
had been given both the Otis Test and the Survey of Space Relations 
Ability Test were compiled into lists prepared according to the supervisor 
for whom the employee worked. Since the specific type of design or 
drafting work under each of the supervisors was concerned with different 
components of the airframe, it was decided to use the simplest scale pos- 
sible. Each supervisor was requested to evaluate the ability of the in- 
dividuals on his list in terms of: Poor, Below Average, Average, Good, 
or Very Good with the provision that he could define the limits of the 
man’s ability in the specific subdivision of the scale in terms of a plus or 
minus. In order to enable the data to be used for purposes of correlation, 
the numbers 1 to 5 were assigned to the various components of the rating.
Where a plus or a minus was indicated, the point was taken as half way
between the two numbers. Thus, Average equalled 3.0 while Average
plus equalled 3.5. Relatively few of the cases were assigned plus or minus 
values by the supervisors. 
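
The numerical coding of the ratings described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name and the text format of the ratings are hypothetical, and the value for a minus rating is inferred from the statement that a plus or minus falls half way between two scale numbers.

```python
# Scale values 1 to 5 as described in the text.
RATING_VALUES = {
    "Poor": 1.0,
    "Below Average": 2.0,
    "Average": 3.0,
    "Good": 4.0,
    "Very Good": 5.0,
}


def rating_to_score(rating: str) -> float:
    """Convert a rating such as 'Average plus' to its numerical value.

    A trailing 'plus' adds half a point and a trailing 'minus' subtracts
    half a point. The source does not say how 'Poor minus' or
    'Very Good plus' were handled; here they simply fall at 0.5 and 5.5.
    """
    text = rating.strip()
    lowered = text.lower()
    offset = 0.0
    if lowered.endswith(" plus"):
        text, offset = text[: -len(" plus")], 0.5
    elif lowered.endswith(" minus"):
        text, offset = text[: -len(" minus")], -0.5
    return RATING_VALUES[text.strip().title()] + offset
```

For example, `rating_to_score("Average")` gives 3.0 and `rating_to_score("Average plus")` gives 3.5, matching the two values quoted in the text.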

Due to the factor of work loadings there is always some degree of
internal transfer in progress within an aircraft engineering department. 
In order to offset this and provide assurance that the ratings would be 
comparable, the supervisor was requested to rate only those individuals 
who had worked for him a sufficient length of time to allow him to make 
an accurate determination of their ability and who had actually demon- 
strated in their work for him the ability being considered. Since the 
industry at this time was faced with the possibility of reductions in the 
total technical personnel and since the supervisors did not know the 
specific purpose of the rating, each supervisor made a careful and accurate 
evaluation. A representative of the engineering personnel office remained 
with the supervisor during the entire time the evaluation was made. The 
lists were distributed and collected by this representative at the time of 
the rating to prevent the supervisor from retaining a copy of the ratings 
for future reference. 

Because of the nature of the work, some men engaged in designing 
were also required to do the actual drafting. The first evaluation of job 
performance was made with respect to drafting skill. The supervisors 
were asked to distinguish carefully between drafting skill and design 
ability and to consider only the output of their employees with regard to 
the quantity and quality of the drafting being done. A period of six 
months elapsed before similar ratings were obtained for design ability to 
prevent the supervisors’ judgments being influenced by their former 
evaluation of the men being considered. 


Results 


Table 1 shows the correlations and intercorrelations existing between 
the criterion of drafting skill and the Otis Test, and the Survey of Space 
Relations Ability. 

Table 2 shows the correlations and intercorrelations between the cri- 
terion for design ability and the Otis Test, and the Survey of Space Re- 
lations Ability. 

The relative lowness of the correlation of the Otis Test with the draft- 
ing skill and design ability may be accounted for in part on the basis of the 
homogeneity of the group which in the majority of instances had been 
pre-selected by their training in engineering colleges. It is believed that 
had no pre-selection taken place these correlations might have been some- 
what higher. 

In order to test the normality of the distributions β₁ and β₂ were
computed for the criterion of drafting skill, the criterion of design ability,
and the distributions of scores for the Otis Test and the Survey of Space
Relations Ability Test. These are shown in Table 3. It will be seen that
the curve for the scores comprising the drafting criterion is somewhat
negatively skewed and leptokurtic in shape, while the design criterion
curve is slightly negatively skewed and mesokurtic in shape. The greater
amount of skewness on the part of the drafting criterion when compared
with the design criterion may be accounted for on the basis that even
though a man was a poor designer in many instances it was still possible
for him to be retained as a good draftsman; while a man who was neither
a passable designer nor draftsman was almost sure to have been
eliminated from the company during the three year period. The skewness of
both groups of data is to be expected when the men being studied have
remained with the company a sufficient length of time to justify their
retention in the department; undoubtedly a part of the lower portion of the
curve had been eliminated by termination and Selective Service losses to
the Armed Forces.


Table 1

Showing the Correlations and Intercorrelations between Supervisors’ Judgments of
Drafting Skill and the Otis Self-Administering Test of Mental Ability
and the Survey of Space Relations Ability. N = 150.


Table 2

Showing the Correlations and Intercorrelations between Supervisors’ Judgments of
Design Ability and the Otis Self-Administering Test of Mental Ability
and the Survey of Space Relations Ability. N = 111.


Table 3

Showing β₁ and β₂ for the Distributions of Evaluation Scores of Drafting Skill, Evaluation
Scores of Design Ability, Scores on the Otis Self-Administering Test
of Mental Ability, and the Scores Obtained on the
Survey of Space Relations Ability *

* N = 150 except for Design Ability where N = 111.
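
The β₁ and β₂ statistics used above to test normality are presumably Pearson's moment coefficients, β₁ = μ₃²/μ₂³ and β₂ = μ₄/μ₂², where μ₂, μ₃ and μ₄ are the second, third and fourth moments about the mean. The sketch below shows one way they might be computed; the function name is hypothetical and the source does not give its computing procedure. For a normal curve β₁ = 0 and β₂ = 3; β₂ above 3 is described as leptokurtic and β₂ near 3 as mesokurtic. Because β₁ is a squared quantity, the direction of skewness is read from the sign of μ₃.

```python
def pearson_betas(values):
    """Return (beta1, beta2, mu3) for a list of scores.

    beta1 = mu3**2 / mu2**3   (0 for a symmetrical distribution)
    beta2 = mu4 / mu2**2      (3 for a normal distribution)
    mu3 is returned as well so that the direction of skew is known:
    a negative mu3 means the distribution is negatively skewed.
    """
    n = len(values)
    mean = sum(values) / n
    # Central moments computed with divisor n (population moments).
    mu2 = sum((x - mean) ** 2 for x in values) / n
    mu3 = sum((x - mean) ** 3 for x in values) / n
    mu4 = sum((x - mean) ** 4 for x in values) / n
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    return beta1, beta2, mu3
```

A set of ratings piled up at the high end of the scale with a thin tail toward the low end would give a negative μ₃, which is the pattern reported for both criteria.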

The tendency toward negative skewness as shown in both the distri- 
butions of the Otis Test and the Survey of Space Relations Ability appears 
to indicate that either a selective factor may have been operating in the 
retention of the better men during the three year period in which the 
tests were given or that most of the cases were graduates of engineering 
colleges which tended to eliminate the lower section of the distribution. 
It is believed that the predictive value of the tests was not lessened be- 
cause of this condition since under the procedures established the tests 
were predicting only skill and ability in the actual operations being evalu- 
ated. 

In order to determine the degree of relationship existing between the 
two measures of the criteria and to ascertain the degree to which the 
factor of “halo” was operating in the ratings of the supervisors, the cor- 
relation between the two sets of ratings was obtained. This was found 
to be .52 with a σᵣ of .069. It is believed that this indicates that the two
ratings had very little relationship other than the fact that many of the 
skillful designers were also skillful draftsmen. 
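
The reported standard error can be checked with the classical large-sample formula σᵣ = (1 − r²)/√N. This is a sketch under two assumptions not stated in the text: that this formula was the one used, and that N = 111, the number of men for whom design ratings were available. The function name is hypothetical.

```python
import math


def standard_error_of_r(r: float, n: int) -> float:
    """Classical large-sample standard error of a correlation coefficient."""
    return (1 - r ** 2) / math.sqrt(n)


# r = .52 between the drafting and design ratings, N assumed to be 111.
sigma_r = standard_error_of_r(0.52, 111)
print(round(sigma_r, 3))  # 0.069
```

Under these assumptions the computed value agrees with the .069 given in the text.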

The relatively low intercorrelation obtained between the Otis Test 
and the Survey of Space Relations Ability Test when used in predicting 
design ability and drafting skill would appear to indicate that the two 
tests are measuring separate areas of aptitude. 

The standard errors of estimate of the coefficients of correlation when
determined by Fisher’s test of significance⁴ and utilized in conjunction
with Guilford’s tables⁵ indicate that the correlations in Tables 1 and 2 are
either significant or very significant when considered under the limits de- 
fined as determining these terms. 


Summary and Conclusions 


1. A definite and positive relationship appears to exist between the 
aptitude as measured by the Survey of Space Relations Ability and de- 
veloped skill in certain types of aircraft design and drafting. 

2. A positive relationship of design skill with brightness or general 
capacity as measured by the Otis Self-Administering Test of Mental 
Ability, Higher Examination, appears to exist. 

3. The specific abilities as predicted by the Otis Self-Administering 
Test of Mental Ability and the Survey of Space Relations Ability are 
not as important in the development of drafting skill as they are in the 
development of design ability within the particular type of design and 
drafting done in the airframe industry. 

4. On the basis of the results obtained it appears possible to utilize
certain types of tests in the selection of engineers to do aircraft design and
drafting, since in addition to technical knowledge the aptitudes or
capacities as measured by the Otis Self-Administering Test of Mental Ability
and the Survey of Space Relations Ability appear related to drafting and
design skill.


Received August 14, 1947. 
Early publication. 


The Problem of Resistance to Change in Industry


Robert N. McMurry 
Robert N. McMurry & Co., Chicago, Illinois 


A medium-sized Middle Western manufacturing company recently in- 
stalled a new and greatly improved wage incentive plan at a cost in excess 
of $20,000. The work was done entirely by outside engineers. These 
engineers did an excellent job technically and management was satisfied; 
the only difficulty was that three weeks later the new plan had been com- 
pletely abandoned and the investment of $20,000 had been totally lost. 
Why was this? 

Industrial progress finds one of its greatest handicaps in the frequent 
resistance of both management and workers to change of any sort. This 
is especially marked if the change is introduced without proper advance 
notice and explanation to those whom it will affect. Even innovations 
which are obviously advantageous are often objects of attack. Where 
the changes threaten either the status or job security of either workers 
or management, their reaction is certain to be quick and violently
negative. In those organizations where employee and supervisory insecurity
is present, even minor revisions of policies or procedures may evoke pro- 
foundly disturbing reactions among individuals and groups. An effort 
is made at once either to block the introduction of the new methods or to 
discredit them after their installation and force their removal. 

Even ordinarily honest and loyal workers and executives will some- 
times lie, misrepresent, and engage in outright sabotage of the new pro- 
cedures, so bitter are the antagonisms aroused. Nor are these mani- 
festations limited to individuals. Large groups of employees may react 
with equal violence when their security or status is at stake. An example 
of this is the frequent reaction of white employees to the introduction of 
Negroes into the work force. The latter are a threat both to their secur- 
ity (the Negroes are considered as competitors in the labor market) and 
to their status (the whites resent being grouped with the Negroes whom 
they regard as of lower status). Actually the Negroes may be highly 
desirable as employees and may contribute to the welfare of the organi- 
zation as a whole. Nevertheless, their introduction is violently resisted. 
While it is customary to attribute these resistances to the reluctance of 
people to change well-established habits, it is probable that the chief 
causes lie far deeper. 

The principal root of this hostility to anything which threatens security
or status is fear (frequently reinforced and rationalized by accumulated
resentments and rivalries). The hostility which this fear generates, in 
turn, leads to attacks upon the sources of the anxiety. The amazing 
feature of these attacks is that many of them come from employees who, 
because of their rank or long service, have no real ground to fear for 
either their status or security. Nevertheless, quite without adequate 
justification, many feel extremely insecure. This is because deep-seated 
fears exist within the individual himself. Everyone knows fear. Even 
the infant is prey to this emotion because it is innate, inborn. Further- 
more, everyone is constantly faced by very real and tangible grounds for 
anxiety and insecurity. Nature is cruel. The law of the fang prevails 
to a greater extent than many recognize. The world at large is no place 
for the weakling. Even business is highly competitive. Rivalries and 
conflicts exist within nearly every business. Realistically regarded, life 
is far from a bed of roses for most people. 

Hence, the real and justifiable fears which beset the average person are 
legion. There is always somewhere in the future the danger of economic 
disaster, of another depression with its threat to savings, to the home, 
to security. Everyone is faced with the problem of old age and its at- 
tendant likelihood of illness, suffering, and dependence. Even in youth 
and the prime of life, there is always the immediate possibility of illness, 
of accidents, and the inevitability of death. Nor are these real grounds 
for fear confined to the individual himself. There is also the fear of mis- 
fortune to loved ones; a fear, again, which the war years have greatly 
stimulated. Finally, there is almost always the more or less immediate 
danger to everyone of loss of his job or of being displaced or demoted, 
with its attendant loss of prestige, “status,” and earnings. 

It must be kept in mind that the average rank-and-file employee in 
industry today, unlike his counterpart of fifty years ago, does not even 
own his own tools. The only commodity he has to sell is his labor or some
readily replaceable skill. He is, therefore, much more dependent eco- 
nomically upon his job tenure than was the case with the man who could, 
if necessary, set up in business for himself. In addition, the longer he 
has remained with a particular company, often the greater his difficulty 
in getting work elsewhere. This is because the bulk of the routine jobs 
in industry today do not require great skill; certainly not in the sense of 
the old-time master craftsman. Consequently, the employee who has 
spent ten to twenty-five years in a particular line of work has gained 
little that is saleable, but has lost his youth, his vigor, and his adaptability 
to new lines of endeavor. He has given the best years of his life and 
often has little of vocational value to show for it. 

It is because of this that there is such a feeling of need for some sort 
of job security among most working people (whether it be seniority or
some other form of property rights in the job). For the same reason, 
anything which threatens job security or hard-won status, such as it is, 
is desperately feared and resented. 

Unfortunately, these real and understandable grounds for fear are not 
the only ones which contribute to employee insecurity. Nearly all per- 
sons also suffer to a greater or less degree from neurotic anxieties and 
fears which have no basis in reality whatever. Among these latter are 
the insecurities which grow out of the passive dependent tendencies of the 
emotionally immature. Others grow out of the repeated rejections to 
which the individual may have been subjected during childhood or youth. 
Still others have their origin in an over-strict conscience, resulting from 
too rigorous an up-bringing. (Nearly everything such persons do makes 
them feel guilty.) Likewise, many neurotic anxieties have their basis in 
buried but powerful hostilities toward loved ones and others which pro- 
duce a free-floating sense of guilt and anxiety and lead to constant 
worrying. 

Many of these fears, regardless of their nature, are too painful to be 
faced; they cannot be lived with. Hence, they have been thrust out of 
the center of the individual’s consciousness; they are vaguely present on 
the periphery. They are not entirely repressed; merely out of sharp
focus. Nevertheless, they continue to exist in a latent state, their power 
to disturb quite undiminished. Their presence constantly disturbs the 
individual’s emotional equilibrium and makes its balance a precarious 
one. When any new challenge to his status or security occurs, it accen- 
tuates his existing anxieties and feelings of insecurity. These added 
fears almost inevitably upset his emotional balance. His latent fears, 
having been reinforced, once more threaten to become painfully conscious. 
This must be avoided at any cost. Hence, he has powerful incentive to 
rid himself of the source of danger to his status and security. 

Fears which even trivial changes arouse are often so powerful that they 
are overwhelming. The fear thus induced is so real and poignant that 
it may even induce a state of actual panic. At this point, the victim 
ceases to be entirely rational, in spite of the fact that he may appear out- 
wardly calm and possessed. If it appears politically expedient, he may 
even indicate a high degree of favor for the very changes which have 
excited his anxiety. Nevertheless, he will stop at nothing to save him- 
self. (This attitude of superficial acceptance of an innovation is some- 
times barefaced hypocrisy; more often the individual’s fears are so acute 
that he cannot take an open stand against anything.) 

Because of the highly emotional character of these resistances to 
change, a direct, logical presentation of the merits of the change is often 
futile. The more they are discussed, the more violent the anxieties they 
are likely to arouse and the greater the individual’s need to discredit
and eliminate them. Even worse, however, is to attempt to explain to 
him the sources in himself of his antagonisms to the projected change.
This only makes him react more violently because it mobilizes fresh anxi- 
eties within him and breaks down his defenses against them. It not only 
forces him to face his naked fears himself, but makes him aware that 
others know his weaknesses. This adds to his anxieties—and to his 
aggressiveness.

In view of the foregoing circumstances, great caution must be exer- 
cised in making any changes in organization or methods, even those which 
are obviously and badly needed. It will never be possible wholly to 
eliminate anxieties in workers and supervisors with consequent resist- 
ances to change for its own sake and as a threat to their status or security. 
Hence, it is essential that any modification of product, procedures, or- 
ganization or policies which may affect status or may be interpreted as an 
implied threat to job security should be considered carefully before it is 
made. It is particularly important that its implications be considered 
from the standpoint of the insecurities and possible anxieties of the em- 
ployees affected. It must always be kept in mind that, regardless of 
the facts, those who will be affected may interpret it somehow as a threat 
to them and respond accordingly. 

Sometimes it is better, in the long run, not to make moderately needed 
changes because the disturbance they will occasion may be more costly 
in the end than will a continuance of the status quo. In those cases, where
there is some real threat to an employee’s status and security in the 
change, it will prove wiser and cheaper to “kick him upstairs” to some 
“advisory” job (thus retaining his status and job security), rather than 
risk the organization-wide disturbance of morale which his demotion or 
other “face”-destroying course of action might bring with it. It is entirely 
possible for one individual, if sufficiently aroused, to disrupt the 
morale and smooth functioning of an entire segment of a business by 
pointing out that what has happened to him could happen to many others. 

If it is finally decided that a change must be made, it is wise to move 
very slowly. Only one innovation should be introduced at a time; ample 
warning must precede it, and a full statement must be given of the reasons 
for it and the benefits which are expected to result from it. If this is done, 
there is less likelihood that the emotional equilibrium of the individual 
or group will be upset. Informing the employee in advance will do much 
to allay the fears that a sudden change might otherwise arouse. There 
will always be some anxiety, but this will help to minimize it. 

Further to allay the fears of those affected, they should be given maximum 
opportunity to participate in the discussion and planning of proposed 
changes in advance of their introduction. They should also have 
some voice in deciding how and when they will be made effective. This 
gives them a feeling of having had at least some part in the determination 
of their own destinies. This tends to minimize their feelings of helpless- 
ness and consequent anxiety in the face of the changes. At the same 
time, it will give them a better insight into, and understanding of, the 
conditions calling for the innovations and the way in which they will be 
of personal benefit to those affected. This, in turn, will allay their anxi- 
eties and discourage the development of resistances and hostilities. 

Finally, if a program calling for other than minor changes is to gain 
acceptance and use, it is imperative that ready outlets be provided for the 
expression and relief of the hostilities which will almost inevitably arise. 
Under the best of conditions, some of those affected will be disturbed and 
unhappy. Therefore, it will be necessary to provide these employees 
with easily accessible facilities to “talk out” their anxieties and resentments 
from time to time. They will not be aware that it is largely fear 
which stimulates their aggressions and needs for reassurance; all they will 
know is that having talked about them, they will feel better. Periodic, 
informal meetings between small groups of the affected employees and a 
representative of top management are to be recommended for this pur- 
pose. He must be patient and sympathetic and give the employees’ 
complaints about the changes, no matter how absurd or unreasonable, a 
fair hearing. This provides a release for their accumulated tensions. 
Such meetings, by bringing resistances out into the open, have 
the advantages both of relieving the rancor of the disgruntled worker or 
supervisor before he has had a chance seriously to disrupt departmental 
morale, and of reviewing the worthwhileness of the new procedures and 
methods. Sometimes it will be indicated that even further changes are 
necessary. 

The resistance of workers, supervisors, and executives to change is 
irritating and often frustrating. This is especially true when the im- 
provements are designed specifically to help them and the company as a 
whole. However, if it is recognized that it is their basic anxieties and 
insecurities which underlie and stimulate their lack of cooperation, not 
sheer stubbornness, selfishness and stupidity, a more understanding and 
sympathetic view can be taken of the problem. These resistances will 
probably never be totally overcome, but through the awareness of the 
basic fears and the application of the principles outlined above, an in- 
formed and constructive course of action can be undertaken to insure the 
acceptance and continued use of the new procedures and policies, even 
though they may incorporate a number of radical innovations. 

Received August 22, 1947. 


Norms for Graduate School Business Students on the 
Minnesota Vocational Test for Clerical Workers * 


Edward K. Strong, Jr. 
Stanford University 


A graduate student wrote as follows, “I am a little concerned about my 
slowness in thinking in terms of numbers. What should my major be?” 
He had taken a battery of tests and the counselor had reported among 
other things as follows: “He does seem to be a little slow in his quantitative 
thinking or in thinking in terms of numbers as indicated from test 
scores.” 

The test basis for the above is that the student had a score of 115 in 
number checking and 147 in name checking on the Minnesota Vocational 
Test for Clerical Workers; and that these scores had been compared with 
the norms for clerical workers, giving percentiles, respectively, of 22 and 
73. Such information is of value and has its place in counseling. But 
to worry a graduate student about his quantitative thinking because he 
has a percentile of 22 compared with clerical workers is very questionable, 
considering that his ACE percentile was 83 and his grades in graduate 
courses were 70 per cent A’s and 30 per cent B’s.¹ 

If his number and name checking scores had been contrasted with 
“adults gainfully employed” as given in the Manual (1) resulting in per- 
centiles, respectively, of 85 and 94, it would indicate quite a different 
estimate of the man—a man who had no intention of being a clerical 
worker but rather a university instructor of business subjects, preferably 
marketing. 

On looking into this matter, the writer was unable to find any norms 
for the Minnesota Vocational Test for Clerical Workers applicable to 
business men or students headed for business. Data on 168 cases are here 
reported as an aid in interpreting the test scores. 

* The occasion for this article was the faulty interpretation of scores on this test 
by a counselor, as reported below, which unnecessarily disturbed an accepted candidate 
for the Ph.D. degree in the Stanford University Graduate School of Business. In 
addition to reporting new norms for the test applicable to university business students 
and presumably also to well educated business men, the question is raised as to what 
the test measures. 

1 Toward the end of the report the counselor added, “The Minnesota Vocational 
Test for Clerical Workers seems also to reveal superior clerical aptitude as his percentile 
for checking names was 73.” It is far from clear why he would be a superior clerical 
worker because of his percentile of 73 in name checking if at the same time his percentile 
in number checking was only 22—so low as to rate him slow in quantitative thinking. 

Norms for Graduate School Business Students 


Table 1 gives the distribution of scores on number and name checking 
for 168 men students in the two year Graduate School of Business at 
Stanford University. Table 2 gives percentile norms for adults gainfully 
employed, 9th grade students and clerical workers, as reported in the 
Manual, and for the 168 business students. 


Table 1 

Distribution of Scores and Percentiles in Number and Name Checking of 168 
Students in Graduate School of Business 

             Number Checking         Name Checking 
Score      Number   Percentile     Number   Percentile 

190           1         99            4         98 
180           2         98            5         95 
170           5         95           13         87 
160           8         90           19         76 
150           7         86           12         68 
140          27         70           24         54 
130          23         57           15         45 
120          27         40           27         29 
110          28         24           16         20 
100          15         15           12         13 
 90          15          6           11          6 
 80           6          2            6          2 
 70           2          1            2          1 
 60           2          1            1          1 
 50                                   1          1 

Mean           126.7                   135.3 
Sigma           24.1                    28.9 

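
The percentile columns of Table 1 can be re-derived from the frequency columns. The article does not say how the percentiles were computed; the sketch below assumes each percentile is the percentage of the 168 cases falling below the class interval, rounded to the nearest whole number and never reported lower than 1. The function and variable names are illustrative only. Under that assumed rule the printed percentiles are reproduced, and the frequencies sum to 168 for each test only if the single case at score 50 belongs to name checking.

```python
# Class-interval frequencies from Table 1 (score interval -> number of students).
NUMBER_CHECKING = {190: 1, 180: 2, 170: 5, 160: 8, 150: 7, 140: 27, 130: 23,
                   120: 27, 110: 28, 100: 15, 90: 15, 80: 6, 70: 2, 60: 2}
NAME_CHECKING = {190: 4, 180: 5, 170: 13, 160: 19, 150: 12, 140: 24, 130: 15,
                 120: 27, 110: 16, 100: 12, 90: 11, 80: 6, 70: 2, 60: 1, 50: 1}


def percentiles_below(freqs):
    """Percent of cases below each interval, rounded half up, floored at 1.

    The rule is an assumption inferred from the printed table, not a
    procedure stated in the article.
    """
    total = sum(freqs.values())
    below = 0                      # running count of cases in lower intervals
    result = {}
    for score in sorted(freqs):    # walk upward from the lowest interval
        pct = 100.0 * below / total
        result[score] = max(1, int(pct + 0.5))
        below += freqs[score]
    return result


if __name__ == "__main__":
    for label, freqs in (("number", NUMBER_CHECKING), ("name", NAME_CHECKING)):
        print(label, sum(freqs.values()), percentiles_below(freqs))
```
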

It is evident that the business students and clerical workers are supe- 
rior to adults gainfully employed and to 9th grade students in both num- 
ber and name checking. The means for the first two groups fall in the 
90-100 percentile range of the last two groups. 

The business students are inferior to clerical workers in number check- 
ing by about 10 percentile points but they are superior to clerical workers 
in name checking to about the same degree. 

Our business students are proportionately better in name checking 
whereas the other three groups are better in number checking. They are in 
this respect more like women clerical workers than men clerical workers 
for the former are also superior in name to number checking. Ninth 
grade girls have about the same scores on both tests and women adults 
gainfully employed average only slightly higher on name than number 
checking so that from available data we cannot say that superiority in 
name over number checking is a feminine characteristic. 


Table 2 

Norms for Number and Name Checking for Graduate School Business Students * 

                    Number Checking                       Name Checking 
                    9th     Business  Clerical            9th     Business  Clerical 
Centiles   Adults   Grade   Students  Workers    Adults   Grade   Students  Workers 

100         179     144      195       198        198     148      195       196 
 90         121     116      160       176        122     114      173       166 
 80         108     107      147       162        107     101      164       154 
 70          97      99      140       151         96      94      152       143 
 60          90      95      132       141         86      88      144       134 
 50          83      90      126       135         78      83      135       126 
 40          75      85      120       129         69      78      126       119 
 30          67      80      114       121         60      73      121       112 
 20          57      73      107       114         48      66      111       105 
 10          45      63       97       104         34      54       97        97 
  0           7      42       60        68          0      30       50        62 

* Norms for adults, 9th grade students and clerical workers as given in the Manual (1). 
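
The effect of the choice of norm group on the opening case (scores of 115 in number checking and 147 in name checking) can be approximated from Table 2. The sketch below is only an illustration: it assumes the rows of Table 2 run from the 100th centile down to 0 in steps of ten, and it interpolates linearly between those rows, whereas the percentiles quoted in the text (22 and 73 against clerical workers, 85 and 94 against adults) were read from the Manual's own tables. The names `NORMS` and `centile_rank` are hypothetical.

```python
CENTILES = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# Scores at each centile, lowest to highest, transcribed from Table 2.
NORMS = {
    ("number", "adults"):   [7, 45, 57, 67, 75, 83, 90, 97, 108, 121, 179],
    ("number", "clerical"): [68, 104, 114, 121, 129, 135, 141, 151, 162, 176, 198],
    ("number", "business"): [60, 97, 107, 114, 120, 126, 132, 140, 147, 160, 195],
    ("name", "adults"):     [0, 34, 48, 60, 69, 78, 86, 96, 107, 122, 198],
    ("name", "clerical"):   [62, 97, 105, 112, 119, 126, 134, 143, 154, 166, 196],
    ("name", "business"):   [50, 97, 111, 121, 126, 135, 144, 152, 164, 173, 195],
}


def centile_rank(score, scores_at_centiles):
    """Linearly interpolate a centile rank from decile norms (an approximation)."""
    if score <= scores_at_centiles[0]:
        return 0.0
    if score >= scores_at_centiles[-1]:
        return 100.0
    for i in range(1, len(scores_at_centiles)):
        lo, hi = scores_at_centiles[i - 1], scores_at_centiles[i]
        if score <= hi:
            step = CENTILES[i] - CENTILES[i - 1]
            return CENTILES[i - 1] + step * (score - lo) / (hi - lo)


if __name__ == "__main__":
    for test, score in (("number", 115), ("name", 147)):
        for group in ("clerical", "adults", "business"):
            rank = centile_rank(score, NORMS[(test, group)])
            print(f"{test} checking {score} vs {group}: about {rank:.0f}")
```

On these assumptions the same two raw scores fall near the 21st and 74th centiles of clerical workers, near the 85th and 93rd centiles of gainfully employed adults, and near the 32nd and 64th centiles of the business students themselves.
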

From data in the Manual (1) it appears that accountants and book- 
keepers (with a 50 percentile of 144), bank tellers (137), and clerical 
workers (135) are superior to business students (126) in number checking, 
while routine clerical workers (124), shipping and stock clerks (104) are 
inferior to business students. But in name checking business students 
(135) are superior to all other groups. They are only slightly superior 
to bank tellers (134) and general clerical workers (131), appreciably 
superior to accountants, bookkeepers (127) and employed clerical workers 
(126) and very superior to routine clerical workers (118) and shipping 
and stock clerks (102). 

The 168 business students are composed of three groups: 53 students 
tested in 1935, 37 tested in 1945 and 78 tested in 1947. The last group 
is markedly superior to the two earlier groups.² See Table 3. The 1947 
class is composed almost entirely of returned veterans and is supposedly 
a superior group in that the entire class was selected from fully twice that 
number of applicants. Only one of four sections was tested on the Minnesota 
Clerical Test. Their mean aptitude score (Ohio State Psychological 
Examination) is 86.5. (A second section averaged 87.8.) But 
these scores are only slightly superior to the 1935 class which had a mean 
score of 82.7 (Thorndike Intelligence Examination for High School 
Graduates). Although tested on different intelligence tests the scores 
are supposedly equated so that scores on the two tests are comparable. 
The difference in intelligence test scores, granted that the scores are 
comparable, is too slight to explain the marked difference in number and 
name checking. No other explanation is apparent, unless it is that the 
veterans are more serious and tried harder to make good scores. We had 
the feeling that the 1945 class did not try very hard but had no such 
feeling regarding the 1935 class. 


Table 3 

Mean Scores of Three Groups of Graduate School of Business Students in 
Number Checking and Name Checking 

Three Groups       Number Checking      Name Checking 
of Students        Mean     Sigma       Mean     Sigma 

53 in 1935         117.7    22.1        124.1    26.5 
37 in 1945         124.3    19.0        127.6    26.4 
78 in 1947         133.8    24.6        146.4    27.4 


2 The mean of 133.8 is superior to 117.7 by a critical ratio of 3.9; it is superior to 
124.3 by a critical ratio of 2.3. The mean of 146.4 is superior to 124.1 by a critical ratio 
of 4.7; it is superior to 127.6 by a critical ratio of 3.5. 
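
The critical ratios quoted for the 1947 class can be checked from the means, sigmas and group sizes in Table 3. The article does not give its formula; the sketch below assumes the usual large-sample form, the difference between two means divided by the standard error of that difference for independent groups. With that assumption the four published ratios (3.9, 2.3, 4.7 and 3.5) are recovered to one decimal. The dictionary layout and function name are illustrative.

```python
from math import sqrt

# (mean, sigma) per test and group size, from Table 3.
GROUPS = {
    1935: {"n": 53, "number": (117.7, 22.1), "name": (124.1, 26.5)},
    1945: {"n": 37, "number": (124.3, 19.0), "name": (127.6, 26.4)},
    1947: {"n": 78, "number": (133.8, 24.6), "name": (146.4, 27.4)},
}


def critical_ratio(year_a, year_b, test):
    """Difference in means over its standard error (independent groups assumed)."""
    mean_a, sigma_a = GROUPS[year_a][test]
    mean_b, sigma_b = GROUPS[year_b][test]
    n_a, n_b = GROUPS[year_a]["n"], GROUPS[year_b]["n"]
    standard_error = sqrt(sigma_a ** 2 / n_a + sigma_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


if __name__ == "__main__":
    for test in ("number", "name"):
        for earlier in (1935, 1945):
            print(test, 1947, "vs", earlier, round(critical_ratio(1947, earlier, test), 1))
```
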


What is Measured by the Minnesota Clerical Test? 


Available data indicate that the Clerical Test measures something 
that enters into successful performance of many clerical activities and 
that employed clerical workers score much higher on the test than the 
general population (see Table 2). We are concerned here, however, not 
with the validity of the test but with the question of what it is that is 
measured by the test. 

Small Differences. According to the Manual (1) “the Minnesota 
Clerical Test is measuring an aptitude which is related positively to the 
abilities to discriminate small differences rapidly, to observe and compare, 
to adjust to a new situation, and to give attention to a problem.” 

Drake’s Visual Perception Test A (3) would seemingly measure the 
qualities claimed for the Minnesota Clerical Test. The test consists of 
100 pairs of concentric circles, the larger circle being about 34” in diameter 
and the smaller circle 4% "in diameter. The requirement is to “check 
every small circle that is not exactly in the center of a large circle.” At 
the beginning almost anyone can perceive the incorrectly centered small 
circles. “As one works toward the end of the test, he encounters smaller 
and smaller differences, until he reaches an area beyond the limit of his 
perceptual ability and in which he can only guess.” 

Is the process of discriminating between small differences in the centering 
of one circle within another similar to discriminating between small 
differences in pairs of numbers or pairs of names? And are the abilities 
to observe and compare, to adjust to a new situation, and to give attention 
to a problem similar in Drake’s Test and the Minnesota Clerical Test? 

Two scores are obtained from both these tests. On the Drake Visual 
Perception Test A we have time in seconds required to finish the test and 
number of errors made; and on the Minnesota Clerical Test we have 
number attempted and number of errors made. Drake states in a letter 
that the two scores on his test should not be combined but each should be 
taken into account in employing applicants. The two scores correlate 
—.45, meaning that errors increase with speed. If, however, the two 
scores are combined, giving equal weight to each, such scores correlate .23 
with R-W number checking scores and .07 with R-W name checking 
scores, based on the records of 69 students in the Graduate School of 
Business. 

It seemed possible that poor eye-sight might explain in part the errors 
made in the two tests. However, a correlation of only .07 was obtained 
between error scores on the Drake test and total error scores on the 
number and name checking tests. (The correlation between error scores 
on the number and name checking tests is .45.) 

A third method of comparing scores on the two tests is to correlate 
time required on the Drake test with number attempted, not R-W, on 
the Minnesota Clerical Test. This procedure gives a coefficient of .04 
between the Drake test and number checking and .29 between the Drake 
test and name checking. The three methods of comparing the two tests 
yield five correlation coefficients ranging from .04 to .29 and averaging .14. 

Is it appropriate to say that the Minnesota Clerical Test measures 
“ability to discriminate small differences rapidly” when it does not correlate 
with the Drake Perception Test which is “a suitable test,” according 
to its author, “for the discrimination of small differences”? 

In this connection, it is well to keep in mind that the two parts of the 
Minnesota Clerical Test correlate only .66, according to the Manual. 
We obtain correlations of .65 with our 1935 group, .33 with our 1945 
group, and .66 with our 1947 group. Seemingly the only difference between 
the two parts of the test is that one part involves numbers and the 
other part, names. Apparently this seemingly slight specific difference 
affects the respective scores very greatly in comparison to all the other 
elements that are common to the two parts. 

Intelligence. For a heterogeneous group, the authors report that 
number checking correlates .47 and name checking .65 with intelligence. 
For homogeneous groups, “for whom the test was designed,” such as 
employed clerical workers, university business students, and high school 
commercial students, “the correlation between intelligence and number 
checking is about .12 and for name checking about .37.” With our 
graduate students in business, intelligence correlates about .18 with number 
checking and about .39 with name checking. (The correlations with 
number checking are .03, .22, .18 and .29 with different groups and with 
different intelligence tests; with name checking the coefficients are .55, 
.67, .07 and .29 respectively.) 

The authors have minimized the relationship between the Minnesota 
Clerical Test and intelligence, stressing the low correlations with homogeneous 
groups and explaining away the higher correlations with heterogeneous 
groups by stating that “even in the heterogeneous groups, the 
Clerical Test is relatively unique with respect to academic ability, since 
the coefficients of alienation, indicating lack of relationship, are .88 for 
clerical number checking and intelligence and .76 for clerical name check- 
ing and intelligence.” 

Granted that the groups considered here are truly heterogeneous and 
homogeneous with respect to each other, then the low correlations with 
the latter groups are substantiation of higher correlations with the former, 
for lower correlations are to be expected among homogeneous than among 
heterogeneous groups. 
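
The expectation that correlations shrink in homogeneous groups can be illustrated with the standard formula for direct restriction of range on one variable. This formula is not cited in the article, and the ratio `u` (standard deviation in the selected group divided by that in the unselected group) is a hypothetical value chosen only for illustration; no such ratio is reported for these samples.

```python
from math import sqrt


def restricted_r(r_full, u):
    """Correlation expected after direct range restriction on one variable.

    r_full: correlation in the heterogeneous (unrestricted) group.
    u: restricted SD / unrestricted SD, 0 < u <= 1 (hypothetical here).
    """
    return r_full * u / sqrt(1.0 - r_full ** 2 + (r_full ** 2) * (u ** 2))


if __name__ == "__main__":
    # Heterogeneous-group correlations with intelligence quoted from the Manual.
    for label, r in (("number checking", 0.47), ("name checking", 0.65)):
        for u in (1.0, 0.75, 0.5):   # illustrative ratios only
            print(f"{label}: r = {r:.2f}, u = {u:.2f} -> {restricted_r(r, u):.2f}")
```

Whatever value of `u` below 1 is chosen, the restricted coefficient is smaller than the unrestricted one, which is the direction of the argument made in the text.
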

What interpretation shall we attach to a correlation coefficient of 
.65? The Manual states, as given above, that name checking and intelligence 
are “relatively unique,” although they correlate .65, having a 
coefficient of alienation of .76. At the same time the Manual points out 
that the two parts of the test “are closely related,” correlating .66, which 
has a coefficient of alienation of .75. The Manual also gives as proof of 
the validity of the test that there is a correlation of .58 between commer- 
cial teacher’s ratings and the test. Here the coefficient of alienation is 
.81. Either we should explain away all correlations of .65 on the ground 
that the accompanying coefficient of alienation of .76 is very high or we 
should accept the correlation of .65 as significant; but we should not do 
both. Either both name checking and intelligence and name checking 
and number checking are “relatively unique” or they are both “closely 
related.” 
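
The coefficients of alienation quoted in this argument follow from the usual definition, k = sqrt(1 - r²). A minimal sketch (the function name is illustrative) shows how little k moves across the correlations under discussion, which is why a single rule of interpretation cannot call one of them “relatively unique” and another “closely related.”

```python
from math import sqrt


def alienation(r):
    """Coefficient of alienation for a correlation r: sqrt(1 - r^2)."""
    return sqrt(1.0 - r * r)


if __name__ == "__main__":
    cases = (
        ("number checking and intelligence", 0.47),
        ("name checking and intelligence", 0.65),
        ("number checking and name checking", 0.66),
        ("teachers' ratings and the test", 0.58),
    )
    for label, r in cases:
        print(f"{label}: r = {r:.2f}, k = {alienation(r):.2f}")
```
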

Is it not better to recognize, first, that the Minnesota Clerical Test 
does correlate significantly with intelligence and, second, that scores on 
the test aid appreciably in differentiating clerical workers from workers 
in general? Furthermore, with a homogeneous group of clerical workers, 
scores on the clerical test and on an intelligence test correlate relatively 
little and both types of scores add to the differentiation. 

Numerical and Verbal Factors. The authors of the Minnesota Clerical 
Test point out that number checking “measures more of a numerical factor” 
and that name checking “measures more of a verbal factor.” The 
former correlates .51 with a test of verifying arithmetical computations; 
the latter correlates .45 with speed of reading (reduced to .30 when 
intelligence is partialled out) and .65 with a spelling test. In contrast, 
the number checking test correlates only .09 with speed of reading. 

The Manual reports correlations between grades in accounting of 167 
university pre-business students and number and name checking of .47 
and .49, respectively. With our graduate students the coefficients are 
.29 and .23, respectively. These data are based on a single course in 
managerial accounting where the emphasis is upon the use of accounting 
far more than upon computations, although computations are involved. 
This probably explains the difference in the size of the coefficients. The 
point here, however, is that number and name checking correlate equally 
well with accounting in both comparisons. Maybe the verbal factor of 
reading the textbook and listening to lectures is as important in a course 
of accounting as the numerical factor in handling figures. We need more 
information on this matter. 


Summary 


1. Norms for the Minnesota Clerical Test are supplied for graduate 
students in business, norms that can be used for well educated business 
men since these students become business men. 

2. Data are supplied which question the statement that the Minnesota 
Clerical Test measures “the ability to discriminate small differences 
rapidly.” Seemingly it would be better to claim that the test 
measures ability to see differences in numbers and names. 

3. Available data suggest that the test correlates significantly with 
intelligence but among a homogeneous group of clerical workers for whom 
the test is designed the correlation between number and name checking 
and intelligence is relatively low. 

Received July 28, 1947. 


Validation of Naval Aviation Cadet Selection Tests 
Against Training Criteria * 


Donald W. Fiske 
University of Michigan 


During the war, much material of theoretical and technical interest 
to professional psychologists could not be published. The end of the war 
placed upon military psychologists the responsibility for adding this 
material to the fund of reported empirical findings. With this purpose 
in view, this paper summarizes a part of the work performed by the 
Aviation Psychology Branch, Division of Aviation Medicine, Bureau of 
Medicine and Surgery, Navy Department. To be specific, this report 
will present some of the evidence concerning the usefulness of naval 
aviation cadet selection tests and will discuss some of the problems encountered 
in this testing program. The activities of naval aviation 
psychologists have been described elsewhere (1, 2, 3, 4), and the history 
of naval aviation selection policy is beyond the scope of this paper. 


The Tests 


1. Wonderlic’s Personnel Test (PT). This group intelligence test is 
an abridged form of the well-known Otis Self-Administering Test, with 
vocabulary, directions, arithmetic reasoning, and other types of items. 
Three forms were used by the Navy, each form having fifty items and a 
time limit of twelve minutes. Since the PT had relatively low reliability 
(perhaps because of its brevity), provision was made for the administra- 
tion of a second form to those applicants who made unsatisfactory scores 
on the first testing. This policy was designed to minimize the number 
of rejections resulting from errors of measurement. While no systematic 
study of repeat reliability was carried out, available data indicated that 
the correlations between different forms of the test averaged close to .70 
in the population of applicants for flight training. 

* Prepared for the Aviation Psychology Branch, Bureau of Medicine and Surgery, 
Navy Department. The following people were actively associated with the studies 
reported in this paper: Captain John G. Jenkins; Commanders Jack W. Dunlap and 
E. Lowell Kelly; Lieutenant Commanders Donald W. Fiske, Martin D. Kaplon, and 
George A. Kelly; Lieutenants John G. Darley, Berna Johnson, Willis C. Schaefer, and 
Charles L. Vaughn; Lieutenant (j.g.) John B. Carroll. The opinions contained herein 
are those of the author and are not to be construed as official or as reflecting the views 
of the Navy Department or the naval service at large. 

Even more serious was the evidence for the inequality of the various 
forms. In random samples of several hundred cases, the mean score on 
Form F was one point above that for Form E. Moreover, the correla- 
tions with the criterion varied in the same sample from —.01 for Form D 
to +.27 for Form E. Finally, the three forms were not equivalent at 
precisely those levels where the PT score was most important: if we examine 
the proportion of men eliminated by the absolute cutting score and 
the proportion of men in the range within which a low score on the PT 
combined with a low score on the Mechanical Comprehension Test led to 
rejection, we find that both of these proportions are twice as great for 
Form D as they are for either one of the other two forms. 

The limitations of the PT led to early efforts to develop an improved 
test. The Aviation Classification Test replaced the PT in October 1942, 
after all of the cadets studied in this report had been selected. A test 
with 112 items and a 45 minute time limit, its reliability is over .90. In 
building this test, particular emphasis was placed upon maximizing its 
capacity to discriminate in the lower total score range which includes 
the cutting scores used for rejecting applicants. 

2. Bennett’s Mechanical Comprehension Test (MCT). George K. 
Bennett of the Psychological Corporation constructed for the Navy 
several forms of his Mechanical Comprehension Test. In this test, the 
verbal factor is minimized because each item includes a drawing to illustrate 
the problem, which is set by a simply-worded question. The content 
includes problems dealing with the functioning of levers, pulleys, 
electrical wiring, etc. The 45 minutes allowed for the 76 items is suffi- 
cient for a large proportion of applicants to attempt all items. 

The split-half reliability of the MCT is relatively low, perhaps be- 
cause of its heterogeneous content. For a group of 439 men scoring above 
19 on the PT (well over 90% of the distribution of applicants), the split-half 
reliability was .80 (odd-even, corrected for full length). One effort 
to divide the test into two halves with equated content yielded a slightly 
lower coefficient! In other groups, the test-retest coefficients varied from 
.84 to .87. The reliability in terms of agreement between two forms was 
.72 for a population of male high school seniors. 
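
The phrase “corrected for full length” is taken here to mean the Spearman-Brown correction, which is the customary step-up for odd-even correlations; the article does not name the formula, so this is an assumption. Under it, the reported .80 implies an uncorrected correlation of about .67 between the two half-tests. The function names are illustrative.

```python
def spearman_brown(r_part, k=2.0):
    """Reliability of a test k times as long as the part whose reliability is r_part."""
    return k * r_part / (1.0 + (k - 1.0) * r_part)


def half_test_r(r_full):
    """Invert the k = 2 case: half-test correlation implied by a full-length value."""
    return r_full / (2.0 - r_full)


if __name__ == "__main__":
    r_half = half_test_r(0.80)
    print(f"implied odd-even correlation: {r_half:.2f}")
    print(f"stepped up to full length:    {spearman_brown(r_half):.2f}")
```
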

The purpose of the MCT is to measure knowledge of “barnyard 
physics,” not rote learning of textbook principles. But does previous 
training in physics raise scores on the MCT? While the available data 
do not afford a definite answer to this question, they do indicate that the 
association between MCT scores and training in physics is not marked. 
Coefficients of contingency for two samples express the relationship be- 
tween MCT score and presence or absence of training in physics as .27 
and .29 respectively. 

A detailed breakdown by education and PT score level is made in 
Table 1. 


Table 1 

Mean Mechanical Comprehension Test Scores for Cadets With and Without Training 
in Physics (for Different Personnel Test Score Levels) 

I. 917 Cadets with Less than Two Years of College Education 

PT              No Courses    High School    College 
Score           in Physics    Physics        Physics     Total 

35-50           54.5          (55.7)*        (57.5)*     55.3 
31-34           52.0          55.0           (55.6)*     53.1 
27-30           49.6          50.6           54.8        50.4 
0-26            49.0          50.2           52.1        49.4 
Total Group     50.3          51.9           54.6        51.0 

II. 1739 Cadets with at Least Two Years of College Education 

PT              No Courses    High School    College 
Score           in Physics    Physics        Physics     Total 

35-50           56.1          54.1           59.9        56.7 
31-34           52.4          53.1           57.9        54.1 
27-30           50.8          49.9           54.2        51.3 
0-26            50.0          50.2           53.5        50.8 
Total Group     51.6          51.6           56.2        52.7 

* Values in parentheses are means based on less than 24 cases. 


It will be seen that the average increment in MCT score associated 
with high school physics courses is quite small, under two points; 
on the other hand, the average increment associated with courses in 
college physics is about four points. Since the size of these gains is 
relatively constant for each PT score level, and since the relationship be- 
tween PT score and training in physics appears to be negligible (.13 in 
one sample), the type of intelligence measured by the PT is probably not 
responsible for the higher MCT scores associated with such training.² 
This material does not prove that academic training in physics affects 
MCT scores. While we might assume that the small gains are the result 
of training in physics, we cannot overlook another consideration: people 
tend to take courses in subjects which interest them, and people tend to 
be interested in those fields for which they have some aptitude. Therefore 
it is possible that cadets with some natural ability in “mechanical 
comprehension” liked physics and took courses in it. If so, then the differences 
reported above would be due to differences not in training but in a special 
ability. Whatever the etiology of these trends, the MCT has been repeatedly 
found to predict success in flight training, and its continued value 
does not appear to be vitiated by the findings reported in this section. 

1 MCT scores in this paper are the number right, the Aviation Psychology Branch 
having dropped the original “Rights minus One-half Wrongs” formula to simplify 
scoring and reduce scoring errors. Since the time limit permitted almost everyone to 
try every item, the two scoring formulas give essentially equivalent discrimination. 

2 There is no apparent explanation, other than sampling error, for the reversals of 
trend in the first and third rows of the lower half of the table (54.1 and 49.9). Each 
of these two means is based on about 80 cases. 
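
The increments discussed above can be read directly from Table 1 by subtracting the “No Courses in Physics” mean from the high school and college means at each PT level. The sketch below uses the table's own values; the dictionary layout and function name are illustrative, and the bottom PT band of the second section is taken to be 0-26, as in the first.

```python
# (no physics, high school physics, college physics) mean MCT scores from Table 1.
TABLE_1 = {
    "under two years of college": {
        "35-50": (54.5, 55.7, 57.5),
        "31-34": (52.0, 55.0, 55.6),
        "27-30": (49.6, 50.6, 54.8),
        "0-26": (49.0, 50.2, 52.1),
        "Total Group": (50.3, 51.9, 54.6),
    },
    "two or more years of college": {
        "35-50": (56.1, 54.1, 59.9),
        "31-34": (52.4, 53.1, 57.9),
        "27-30": (50.8, 49.9, 54.2),
        "0-26": (50.0, 50.2, 53.5),
        "Total Group": (51.6, 51.6, 56.2),
    },
}


def increments(row):
    """Gain over the no-physics mean for high school and for college physics."""
    none, high_school, college = row
    return round(high_school - none, 1), round(college - none, 1)


if __name__ == "__main__":
    for education, rows in TABLE_1.items():
        for pt_level, row in rows.items():
            hs_gain, college_gain = increments(row)
            print(f"{education:30s} {pt_level:12s} HS {hs_gain:+.1f}  college {college_gain:+.1f}")
```

For the two “Total Group” rows the high school increments are 1.6 and 0.0 points and the college increments 4.3 and 4.6 points, in line with the “under two points” and “about four points” stated in the text.
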

3. The Biographical Inventory (BI). The Biographical Inventory was 
developed in 1940-41 by E. Lowell Kelly and others under the auspices 
of the CAA-NRC Committee on the Selection and Training of Aircraft 
Pilots (5). While the original form—designed to discriminate between 
good and poor civilian pilots—contained several hundred items, the form 
used by the Navy during most of the war had 150 items on biographical 
topics, interests, habits, attitudes, and preferences. Unlike the PT and 
the MCT, the BI has no a priori right or wrong answers. Its key was 
based on an analysis of the responses of cadets who later passed or failed 
in flight training. 

Administered on an experimental basis throughout 1941, the BI was 
first used three weeks after Pearl Harbor, not as a screening device but as 
relevant evidence in cases where students were doing poorly in flight 
training. Either a BI score of E (in the bottom 3%) or a BI score of D 
(in the next 22%) together with a low MCT score were to be considered as 
indicating an extremely poor probability of completing flight training; if 
such cases failed two out of three flight checks, they were to be dropped 
without additional time. The BI continued to be used in this “advisory” 
fashion throughout 1942. The last day of that year saw the introduction 
of a new index, the Flight Aptitude Rating, which was based on combina- 
tions of BI and MCT scores. In the latter part of the war, when large 
numbers were available to fill small quotas, the Flight Aptitude Rating 
was used in selection—the minimum qualifying score being raised or low- 
ered as the needs of the service dictated. 

No applicant for flight training was ever rejected solely on the basis of 
his BI score. One reason for avoiding the use of the BI as a separate 
selection instrument was its susceptibility to faking, to an indeterminate 
extent. Since the test was originally validated on men taking it after 
they had been accepted for training, it was impossible to know how 
good a predictor it would be if administered before that stage. The 
eventual use of the Flight Aptitude Rating minimized this difficulty: 
even if the BI were ineffective as a selection device, the Flight Aptitude 
Rating would nevertheless contain the discriminatory power of the MCT. 

Experience with item-analysis of the BI has shown that its items are 
quite unstable although the validity coefficients for total score are reason- 
ably stable. For example, one key gave typical criterion correlations 
for each of two samples, yet separate item-analyses showed that, while 
some responses discriminated well in both groups, others discriminated 
well in one group and not at all in the other, and some responses even 
worked in opposite directions in the two groups—that is, a given answer 
might be significantly associated with passing in one group and with 
failing in the other! 

A similar difficulty was encountered in an attempt to develop a BI 
key specifically to predict flight failures, instead of all failures taken as one 
group. From an item-analysis which compared 400 flight failures with 
675 graduates, a key was constructed and tested on two other independent 
samples of 423 and 202 respectively. Although these samples contained 
only flight failures and passers, the correlations with pass-fail were in 
both cases lower for the flight-failure key than for the usual key. 

The key in use for most of the war was based on 1141 passers and 
673 failures. It proved to be relatively satisfactory because the large 
number of cases reduced sampling errors and increased the significance 
of the obtained differences. As might be expected, however, the relia- 
bility was not high. Test-retest reliability was in the order of .70 for a 
group of almost 2000 men who took the test for a second time three to 
twelve months after the first administration. For another group, the 
correlation between part-score based on items with positive weights and 
part-score based on items with negative weights was .79 (corrected for 
the full length of the test). 
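
The paper does not say how the part-score correlation was "corrected for the full length of the test." A minimal sketch follows, assuming the usual Spearman-Brown step-up and assuming the positive-weight and negative-weight parts can be treated as two comparable halves (both are assumptions, not statements from the source). The function names are illustrative.

```python
def spearman_brown(r_part: float, length_factor: float = 2.0) -> float:
    """Step a part-test correlation up to a test `length_factor` times as long."""
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)


def uncorrected_from_full(r_full: float, length_factor: float = 2.0) -> float:
    """Invert the step-up: the part-test correlation implied by a corrected value."""
    return r_full / (length_factor - (length_factor - 1.0) * r_full)


# The reported corrected value is .79; under the assumptions above the raw
# correlation between the two part-scores would have been about .65.
r_half = uncorrected_from_full(0.79)
print(round(r_half, 3))                   # 0.653
print(round(spearman_brown(r_half), 2))   # 0.79
```

On these assumptions the corrected figure of .79 is somewhat higher than the test-retest value of about .70, which is plausible given the three to twelve months between administrations.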

4. Scores and grades. The correlations reported in this paper will 
be based on distributions with few class-intervals, in order to approximate 
the results to be expected with the five letter-grades used during most of 
the war. Since administrative decisions regarding selection and elimi- 
nation involved letter-grades, not raw scores, it is appropriate that the 
usefulness of the tests be assessed in terms of comparably coarse group- 
ings. A secondary consideration is the finding that, for the BI and the 
MCT, correlations computed from raw scores were all within .02 or .03 
of comparable correlations computed from five or six class-intervals or 
letter-grade groupings. 


The Populations 


Between February 1941 and V-J Day 1945, several hundred thousand 
young men applied for Navy flight training. The changing conditions of 
selection and training, as well as the varying influence of world events, 
render it impossible to make meaningful studies of all these applicants 
or even of all accepted cadets taken as one group. Hence we shall ex- 
amine in this paper the data for certain delineated samples. 

1. General selection factors. The men in the populations to be con- 
sidered in this report were selected on the following factors: (a) They were 
between 18 and 26 years of age (inclusive) at the time of application. 
To be more precise, the 1941 group and the early 1942 group were between 
20 and 26 years of age (inclusive), while the third sample included men 
between 18 and 26. (b) They were all high school graduates. Those in 
the first two groups had, in addition, at least two years of college. (c) 
They were unmarried. (d) They were able to pass the strict flight physi- 
cal examination. (e) They all volunteered for flight training. 

In addition to these definite characteristics, there were general and 
intangible factors operating to delimit these samples. By definition, 
some interest in flying is common to all aviation cadets. While many 
were highly motivated, all of them at least preferred aviation to other 
types of military duty at the time of application. 

2. July-November 1941 Sample (Sample A). In order to study all 
available cases where the criteria were unaffected by test scores, a research 
sample was established which included aviation cadets entering training 
in the first “elimination” stage between July and November 1941, inclusive. 
The group was limited to these months because in the first half 
of 1941 the testing program was not fully organized and there were not 
enough psychologists in uniform to insure systematic testing of all ap- 
plicants. Furthermore, one objective was to compare the value of the 
three tests for the same population, and although the PT and BI were in 
use from February, the MCT was not introduced until June. 

While the sample does not include all men believed to have entered 
training in this period, the available evidence gives no indication that 
it is biased in any way relevant to the purposes of this study. Sample A 
has one special characteristic: its members applied for flight training 
before the war started, many of them looking forward to a career in com- 
mercial aviation. 

3. March-April 1942 Sample (Sample B). Since the men in this 
sample entered primary flight training in March and April of 1942, within 
a few days of their application, they were presumably selected and trained 
under the provisions of the February 1942 directive which set low cutting 
scores for the PT and the MCT and which also provided for introducing 
MCT and BI scores as significant evidence in the consideration of candi- 
dates whose performance was borderline. 

4. September—October 1942 Sample (Sample C). Unlike Samples A 
and B, this group applied for flight training under broadened require- 
ments admitting high school graduates and men of 18 and 19 years of 
age. Furthermore, as far as can be determined, all in Sample C had at- 
tended pre-flight school before entering primary training in September 
and October, whereas those in the first two samples had not. Since it 
was not possible to set up this sample in terms of date of entrance into pre- 
flight schools, Sample C omits an estimated 1 or 2% of originally accepted 
applicants who failed at these schools for reasons other than poor flying 
ability (no flying was done in pre-flight schools). There is no reason to 
believe that this group was particularly affected by subsequent changes 
in the use of test scores for elimination from training or by the sharp 
reduction in training quotas. No later samples are mentioned in this 
report because it is difficult, if not impossible, to evaluate the effects of 
subsequent changes in the training program and in training policy upon 
obtained validity coefficients for the various tests. 


The Criteria 


The objective in this testing program was to select those men who 
would make the best members of a combat organization. Since there 
was, in the early part of the war, no definite information on the charac- 
teristics of good and poor combat pilots, the immediate goal at that time 
was the identification of good and poor cadets as indicated by their success 
or failure in flight training. Obviously this is a necessary intermediate 
criterion, since no man could be a success in combat if his failure in train- 
ing prevented him from reaching operational duty. It was always pos- 
sible that this policy of selecting men most likely to succeed in training 
might cause the elimination of those men most likely to succeed in combat. 
Only with the subsequent analysis of combat criterion data was this 
hypothesis shown to be untenable. 

1. Outcome of training. This principal training criterion was used 
almost exclusively by the Aviation Psychology Branch because it was an 
objective fact. While the circumstances attending a failure to complete 
flight training might occasionally be as fortuitous as an academic failure 
in education, training outcome is the concrete administrative datum. 

In view of the dichotomous nature of this primary criterion, the 
biserial coefficient of correlation is the statistic used throughout most of 
this paper. Moderate relationships between test scores and a criterion 
become impressive when presented in terms of percentage failing for 
each test grade (cf. Figure 1), especially when the test score categories 
are so numerous that each contains only a small proportion of the total 
population. While such graphs may be useful in discussing with admin- 
istrators the practical implications of validity coefficients, the use of a 
single value such as the biserial r facilitates comparisons of test validities 
for different tests and different groups. 
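
The paper does not give its computing formula for the biserial r. The sketch below uses the standard textbook form, r_bis = ((M_pass - M_fail) / SD) * (p * q / y), where p and q are the proportions passing and failing and y is the ordinate of the unit normal curve at the point dividing p from q. The function name, the 1-to-5 coding of letter-grades, and the ten-cadet data are all hypothetical illustrations.

```python
from statistics import NormalDist, fmean, pstdev


def biserial_r(scores, passed):
    """Biserial correlation between a graded score and a pass/fail outcome."""
    pass_scores = [s for s, ok in zip(scores, passed) if ok]
    fail_scores = [s for s, ok in zip(scores, passed) if not ok]
    if not pass_scores or not fail_scores:
        raise ValueError("need at least one passer and one failure")
    p = len(pass_scores) / len(scores)       # proportion passing
    q = 1.0 - p                              # proportion failing
    unit_normal = NormalDist()
    # Ordinate of the unit normal curve at the cut separating p from q.
    y = unit_normal.pdf(unit_normal.inv_cdf(p))
    mean_gap = fmean(pass_scores) - fmean(fail_scores)
    return mean_gap / pstdev(scores) * (p * q / y)


# Hypothetical letter-grades coded 5 (best) to 1 (worst) for ten cadets.
scores = [5, 4, 4, 3, 3, 3, 2, 2, 1, 1]
passed = [True, True, True, True, False, True, True, False, False, False]
print(round(biserial_r(scores, passed), 2))
```

Because the coefficient assumes a normally distributed trait underlying the pass-fail split, it runs higher than the point-biserial r computed on the same data.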

2. Reason for failure. This criterion is really a refinement of the first 
one. If selection tests are to eliminate potential failures, they should 
discriminate on each of the major continua, the lower ends of which are 
failure. The primary variable is flying ability, the largest group of failures 
being those designated as deficient in this respect. Performance in 
ground school is another ability scale—a small proportion of cadets were 
dropped for failure to master the class-room courses in the syllabus. A 
different type of variable is motivation, indicated at the low end by those 
cadets who were dropped at their own request. The proportion of cadets 
who failed for each of the various other reasons is so small that we can 
disregard these reasons except for noting that such failures increase the 
heterogeneity of the failure group as a whole. 

The true reason for failure is not always known. For example, a 
cadet who wished to be discharged might intentionally do poorly in his 
flying. However, the only available data, the official reasons, provide us 
with failure groups the empirical characteristics of which are sufficiently 
different to warrant analysis by separate groups. 

3. Other criteria. The number of flight hours which a cadet takes to 
graduate is an index of his speed of learning to fly. To save space, data 
on this criterion will not be presented in this paper. As one might expect, 
for each test, the cadets who obtained high scores tended to graduate 
with fewer flight hours than those who obtained low scores; among the 
failures, cadets earning high scores tended to remain in training longer 
than those receiving low scores. 

One may question the omission of the obvious criterion of flight grades. 
Apparently the number of grades required of the busy instructors was too 
great to permit them to mark carefully, for detailed analysis showed that 
flight marks did not differentiate sufficiently to provide a reliable criterion. 
Furthermore, the correlations between flight marks assigned during 
various training stages were low. 


Validation Against Training Criteria 


1. The Personnel Test. Table 2 indicates that the biserial correlations 
between PT score and pass-fail for the various samples are low and show 
little variation in predicting outcome of flight training. Perhaps, in the 
later samples, the curtailment of range due to using the test for selection 
more than offset the increase in variability which followed the lowering of 
educational requirements. In all samples, the values are quite low for a 
test which was used as a selection device. However, there were a priori 
reasons for requiring a minimum score on the PT: naval aviators should 
have a certain minimum amount of verbal intelligence in order to understand 
complex instructions and to make sound judgments. Furthermore, 
the PT was expected to help weed out potential ground school failures. 
It served this purpose, although the number of such failures was small 
during most of the period of its use. The PT was of little value for predicting 
other types of failures. 


Table 2 
Personnel Test: Criterion Correlations for Failure Groups * 

                   Flight        Ground School   Dropped at     All Other      All 
                   Failures      Failures        Own Request    Failures**     Failures 
Sample     N       r     N       r     N         r     N        r     N        r     N 
A        2356     .12  (452)    .31   (45)      .04   (56)     .14   (42)     .17  (595) 
B        1818     .08  (295)    .20   (24)     -.04   (56)     .14   (72)     .11  (447) 
C        2073     .08  (228)     --   (13)      .07   (76)     .01  (101)     .08  (405) 

* Each coefficient is a biserial r comparing the designated failure group with all other 
entrants (passers plus any failures not in the designated group). 
** All failures except those groups for which correlations are cited. 

The Aviation Classification Test, which replaced the PT, made a more 
important contribution, because the number of ground school failures in- 
creased with the lengthening of the academic part of the training course. 
Because the ACT was introduced after the three samples had been se- 
lected, no detailed validation data will be presented in this report. The 
available evidence indicates that the ACT, like the PT, was useful primarily 
for the identification of potential ground school failures. 

2. The Mechanical Comprehension Test. Criterion correlations for 
the MCT are given in Table 3. The most striking feature is the narrow 
range of the coefficients within each column. The difference between the 
values for “All Failures” for Samples B and C (.32 and .27 respectively) 
is probably a consequence of restricted range following the introduction 
of the MCT into selection after Sample B had applied for flight training. 
This constricting effect seems to have been greater than the augmenting 
effect of using MCT scores occasionally in making administrative decisions 
about cadets whose performance was borderline. 


Table 3 
Mechanical Comprehension Test: Biserial Criterion Correlations for Failure Groups 

                   Flight        Ground School   Dropped at     All Other      All 
                   Failures      Failures        Own Request    Failures*      Failures 
Sample     N       r     N       r     N         r     N        r     N        r     N 
A        2356     .33  (452)    .25   (45)      .11   (56)     .11   (42)     .35  (595) 
B        1818     .27  (295)    .32   (24)      .16   (56)     .14   (72)     .32  (447) 
C        2073     .29  (228)     --   (13)      .10   (76)     .12  (101)     .27  (405) 

* All failures except those groups for which correlations are cited. 

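
The restricted-range argument made for the MCT can be illustrated with the standard correction for direct selection on the predictor. This is a sketch under stated assumptions: the formula is the usual textbook one for explicit selection on the test, the paper reports no standard deviations, and the ratio of 0.8 between the selected and unselected standard deviations is purely hypothetical. Function names are illustrative.

```python
import math


def restricted_r(r_unrestricted: float, sd_ratio: float) -> float:
    """Correlation expected after selection shrinks the predictor SD.

    sd_ratio = SD in the selected group / SD in the unselected group (below 1).
    """
    r = r_unrestricted
    return r * sd_ratio / math.sqrt(1.0 - r * r + r * r * sd_ratio * sd_ratio)


def corrected_r(r_restricted: float, sd_ratio: float) -> float:
    """Inverse operation: estimate the unrestricted correlation."""
    r, k = r_restricted, 1.0 / sd_ratio
    return r * k / math.sqrt(1.0 - r * r + r * r * k * k)


# Starting from the Sample B value of .32, a hypothetical 20% shrinkage in the
# SD of MCT scores would be enough to pull the coefficient down to about .26.
shrunk = restricted_r(0.32, 0.8)
print(round(shrunk, 2))                    # 0.26
print(round(corrected_r(shrunk, 0.8), 2))  # 0.32
```

A drop of this size is of the same order as the observed difference between .32 (Sample B) and .27 (Sample C), which is consistent with, though no proof of, the explanation offered in the text.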

The table also indicates that the MCT predicted both flight and 
ground school failure well. While the correlations for other types of 
failures are low, they are all positive. Their small size is in part a function 
of the method of correlational analysis: since these biserial coefficients 
compare each group of failures with a group composed of all passers 
and all other failures, we are, in effect, asking the test to differentiate 
between each of these small groups and a population showing considerable 
variability because of its heterogeneity. 

3. The Biographical Inventory. BI validity coefficients for the three 
samples are given in Table 4. There is once again a remarkable consistency 
in the values for “All Failures,” especially when the comparative unreliability 
of the test is recalled. Moreover, Sample C was scored with 
a revised key which differed considerably from previous keys used to 
score the earlier samples. Although the BI turned out to be of slight 
usefulness in identifying other types of failures, it was originally expected 
to predict flight failures, a task it did well. 

Table 4 
Biographical Inventory: Biserial Criterion Correlations for Failure Groups 

                   Flight        Ground School   Dropped at     All Other      All 
                   Failures      Failures        Own Request    Failures*      Failures 
Sample     N       r     N       r     N         r     N        r     N        r     N 
A        2356     .29  (452)    .06   (45)      .18   (56)     .04   (42)     .30  (595) 
B        1818     .34  (295)    .09   (24)      .14   (56)     .1?   (72)     .33  (447) 
C        2073     .34  (228)     --   (13)      .09   (76)     .23  (101)     .35  (405) 

* All failures except those groups for which correlations are cited. 


4. The Flight Aptitude Rating. For administrative reasons, the grades 
on the MCT and BI were combined into a Flight Aptitude Rating. Since 
the FAR was not used in selection until 1943, we shall report no FAR 
studies for our three samples but shall summarize other analyses. Of 
prime significance was the study illustrated in Figure 1, based on all cases 
entering training before December 1941, for whom pass-fail data were 
available in October 1942; this population overlaps greatly with our 
Sample A. Although the failure rates for individual cells (i.e., for men 
receiving a given combination of BI and MCT grades) are not all reliable 
because of the small frequencies in some of them, the range in these rates 
is the maximum, from 0% to 100%. Combinations of cells were grouped 
together into Flight Aptitude Groups, as indicated by the Roman numerals 
above the failure rates. The range of failure rates for these groups 
is from 12% to 52% (see Table 5). 


Fig. 1. Percentage of failures for B.I. and M.C.T. combinations. This chart is 
based on 3294 cases where tests were not used in the original selection. Shading 
indicates that the percentage of failures in that cell is not statistically reliable. 














Table 5 
Failure Rates for Flight Aptitude Groups 

Flight                            Percentage of 
Aptitude      Proportion          Population in 
Group         of Failures         Each Group 
I                .12                  17% 
II               .19                  25% 
III              .25                  25% 
IV               .34                  20% 
V                .52                  13% 
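
The way a movable qualifying score trades applicants for a lower failure rate can be worked out directly from Table 5. The sketch below is illustrative only: the figures are transcribed from the table (the Group IV proportion is read as .34 from a partly illegible scan), and the function and variable names are hypothetical.

```python
# Group -> (proportion failing, share of the population), as read from Table 5.
FLIGHT_APTITUDE_GROUPS = {
    "I": (0.12, 0.17),
    "II": (0.19, 0.25),
    "III": (0.25, 0.25),
    "IV": (0.34, 0.20),
    "V": (0.52, 0.13),
}


def failure_rate_if_accepting(accepted_groups):
    """Return (expected failure rate, share of applicants accepted)."""
    share = sum(FLIGHT_APTITUDE_GROUPS[g][1] for g in accepted_groups)
    failing = sum(
        FLIGHT_APTITUDE_GROUPS[g][0] * FLIGHT_APTITUDE_GROUPS[g][1]
        for g in accepted_groups
    )
    return failing / share, share


for cutoff in (["I", "II", "III", "IV", "V"],
               ["I", "II", "III", "IV"],
               ["I", "II", "III"]):
    rate, share = failure_rate_if_accepting(cutoff)
    print(f"accept {'-'.join(cutoff)}: {share:.0%} of applicants, "
          f"{rate:.1%} expected to fail")
```

On these figures, excluding Group V lowers the expected failure rate from about 27% to about 23% at the cost of 13% of the applicants; excluding Group IV as well brings it to a little under 20%.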





Another research study was made in early 1945 on cadets entering 
training in July, August and September 1941, in which the raw test scores 
were converted into the revised 1943 letter-grades, and for which all BI 
papers were rescored with the revised BI key in use at the time of the 
study. The results, which were essentially the same as those cited above, 
were employed to set up the FAR groups in administrative use for the 
next year and a half. While in this population the BI and the MCT 
correlated .33 and .36 with the criterion, the biserial r between pass-fail 
and FAR was .43. Since this latter value was exactly the same as the 
multiple R between pass-fail and these two tests, it is apparent that the 
FAR made the maximum possible use of the differentiations provided by 
the tests. Subsequent studies yielded similar findings. 












The Selection Battery: Intercorrelations and 
Multiple Criterion Correlations 


The three sets of intercorrelations among the selection tests have the 
same degree of consistency which was noted above for the criterion cor- 
relations of each test. Typical are those for Sample B: PT and MCT, 
.30; PT and BI, .05; MCT and BI, .25. The absence of relationship be- 
tween the BI and the PT is to be expected, since these predictors measure 
completely different aspects of the individual. The correlation between 
the two tests of mental ability is fairly low because of the difference be- 
tween abstract-verbal ability (PT) and mechanical comprehension 
(MCT) and because of the relatively curtailed range of ability found in 
applicants for flight training. The relationship between the MCT and 
the BI may be in part the association between ability and interest—per- 
haps young men with ability to understand physical principles tend to 
develop interests in activities—such as flying—that require such ability. 

Even the highest of these correlations indicates a rather small degree of 
overlap between the factors measured by the tests. While it would be 
advantageous to use predictors with no intercorrelation, such a condition 
is rarely possible. The BI-MCT relationship is sufficiently low to enable 
the two tests to be used together, their grades being combined into an 
FAR which predicts better than either test by itself. 

The multiple correlations between outcome of training and the BI plus 
the MCT are .41 for each of the three samples, a value which is apprecia- 
bly higher than the separate r’s for each test alone. Because the criterion 
correlations for the PT are low, it adds almost nothing to the prediction 
from the BI or the MCT. 
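
The multiple correlation of .41 can be checked from the coefficients already reported. The sketch below uses the standard two-predictor formula, R^2 = (r1^2 + r2^2 - 2 r1 r2 r12) / (1 - r12^2), with the Sample B values from the text (BI .33, MCT .32, BI-MCT intercorrelation .25). Treating the biserial criterion correlations as ordinary r's in this formula is an assumption, and the function name is illustrative.

```python
import math


def multiple_r(r1: float, r2: float, r12: float) -> float:
    """Multiple R of a criterion on two predictors.

    r1, r2: criterion correlations of the two predictors.
    r12: correlation between the predictors themselves.
    """
    r_squared = (r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 * r12)
    return math.sqrt(r_squared)


# Sample B: BI = .33, MCT = .32, BI-MCT intercorrelation = .25.
print(round(multiple_r(0.33, 0.32, 0.25), 2))  # 0.41
```

The gain over either test alone depends on the low intercorrelation: with r12 = 1 the second test would add nothing, while with r12 = 0 the same two validities would give R of about .46.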


Other Factors Related to Test Scores and to the Criterion 


1. Age. The correlation of age with success in training was —.19 
for Sample A, indicating that the younger cadets had a slightly better 
chance of graduating. The failure rate varied from 12% for 20-year-olds 
to 38% for 26-year-olds. Since the correlation of age with each of the 
three tests was negligible, the multiple R’s between the criterion and each 
test plus age were .04 to .06 higher than the corresponding r’s for each test 
alone. 

2. Previous flight training. Previous flight training was correlated 
to a small extent with success in Navy flight training. Such experience 
not only gave some cadets the advantage of extra practice but also indicated 
strong or lasting interest in flying. Analyzing Sample B, we 
find that outcome of training correlated .24 with previous flight training, 
somewhat less than it correlated with scores on the BI 
and the MCT. While the BI correlated .29 with previous training, 
it is apparent from Table 6 that the BI is far from being merely a measure 
of this factor. In fact, the BI predicted almost as effectively for 
cadets with no previous training as it did for all cadets. Since the PT 
and the MCT have small correlations (.11 or less) with previous training, 
their predictive efficiencies were even less affected by this factor. 


Table 6 
Criterion Correlations for Groups With and Without Previous Flight Training 

                                    Biserial Correlations with Training Outcome 
Previous                                                         Mechanical 
Flight                              Personnel    Biographical    Comprehension 
Training       Sample      N        Test         Inventory       Test 
None           B         1187       .11          .30             .30 
None           C         1451       .08          .31             .28 
Some*          B          629       .24          .13             .29 
Some*          C          486       .10          .24             .38 
All cases**    B         1818       .11          .33             .32 
All cases**    C         2073       .08          .30             .27 

* The coefficients for cadets with previous training are unreliable because of the 
small failure rates (10% and 7% respectively). 

** Information on previous flight training was not available for 2 cadets in Sample B 
and for 136 cadets in Sample C. The omission of these latter cases accounts for the 
fact that, for Sample C, the MCT correlations for men without and with previous flight 
training are both higher than that for all cases. 


Other Experimental Tests 


While the tests mentioned above were the only ones used in the selec- 
tion of naval aviation cadets, the Aviation Psychology Branch tried out 
several other tests. Although some psychomotor tests were given experi- 
mental trials and yielded promising criterion correlations, they were never 
included in the selection battery because of practical and administrative 
considerations. 

Among the experimental paper-and-pencil tests were one measuring 
motivation for flying and one measuring information about aviation. 
The motivation test showed sizable criterion correlations but its high 
correlation with the BI prevented it from making a significant addition 
to the prediction from the existing battery. The information test was less 
useful, particularly because it was too closely related to previous flight 
training, but also because its items were affected by the rapid changes in 
aviation during the war. 


Summary 


Three selection tests were used to screen applicants for Navy flight 
training during the early part of the war: the Personnel Test, the Mechanical 
Comprehension Test, and the Biographical Inventory. This paper 
reports the correlations between each test and outcome of training, as 
obtained on each of three samples of cadets entering primary flight train- 
ing. While the Personnel Test had low correlations with success in 
training when all failures were grouped together, it was useful in predict- 
ing ground school failures. The Mechanical Comprehension Test showed 
a useful relationship with outcome of training and identified flight and 
ground school failures. The Biographical Inventory predicted flight 
failures better than failures for other reasons. The intercorrelation be- 
tween the Biographical Inventory and the Mechanical Comprehension 
Test was sufficiently low to permit combinations of scores on these tests 
to predict the criterion substantially better than the prediction from either 
test alone. Both age and previous flight training were also related to the 
criterion. 

Received February 28, 1947. 


The Analysis of Personnel Data in Relation to 
Turnover on a Factory Job 


Joseph Tiffin, 
Purdue University 


B. T. Parker, and R. W. Habersat 
Bausch & Lomb Optical Company 


The hiring and training of employees for even routine and unskilled 
jobs becomes unreasonably expensive if the turnover of employees on the 
job is high. The purpose of this report is to illustrate how an analysis 
of personnel data obtained at the time of employment for all new em- 
ployees in a certain department revealed certain significant differences 
between employees who later acquired long tenure and employees who 
stayed on the job for only a short time. 


Procedure 


The turnover of employees in one department of an optical manu- 
facturing company had reached rather serious proportions. A personnel 
testing program had reduced somewhat the rate of turnover, but had not 
succeeded in reducing the rate to a satisfactory level. The present in- 
vestigation was conducted to determine whether a consideration of per- 
sonnel data would reduce still further the frequency of employee termi- 
nations. 

For the present investigation, records of two groups of employees 
were chosen for study. One group consisted of 27 employees who were 
still on the job nine months after employment. A period of nine months 
was chosen because any reasonably satisfactory employee remaining on 
the job for this length of time more than repaid the cost of hiring and 
training him on the job. This group will be referred to as “Long Tenure 
Employees.” A second group consisted of 60 employees who left the 
job, often without notice, within three months after employment. This 
period was chosen because any employee leaving prior to three months 
involved a definite loss to the company in hiring and training cost. This 
group will be referred to as “Short Tenure Employees.” All employees 
in both groups were male. The following personnel data on all employees 
in both groups were obtained at the time of employment: 1. Age; 2. Years 
of formal education; 3. Height; 4. Weight; 5. Marital status; and 6. Number 
of dependents. 

Table 1 shows the results of a statistical analysis of personnel data on 
the items listed above for the “Long” and “Short” tenure employees. 
The statistical significance of each obtained difference is shown in the 
last column of Table 1. 


Table 1 
Analysis of Personnel Data for “Long” and “Short” Tenure Employees 
Hired for a Specific Factory Job 

                        “Long” Tenure      “Short” Tenure     Difference 
                        Employees          Employees          Between 
                        (9 months          (under 3           “Long” and         Critical 
Personnel Data          or over)           months)            “Short”            Ratio 
Age                     30.8 years         25.7 years         +5.1 years         4.84* 
Years of Education      9.7 years          10.6 years         -.9 years          2.04** 
Height                  68.6 inches        69.1 inches        -.5 inches         14 
Weight                  163.4 pounds       158.3 pounds       +5.1 pounds        1.16 
Marital Status          59% married        37% married        +22%               1.95** 
Number of Dependents    1.68 dependents    .72 dependents     +.95 dependents    3.01* 

* Significant at 1% level or less. 
** Significant at 5% level or less. 
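
The authors do not state how the critical ratios in Table 1 were computed. A minimal sketch for the one row that can be checked from the published figures, Marital Status, follows. It assumes the critical ratio is the difference between the two proportions divided by the standard error of that difference, with each group's variance estimated separately. The function name is illustrative.

```python
import math


def critical_ratio_proportions(p1: float, n1: int, p2: float, n2: int) -> float:
    """Difference between two proportions divided by its standard error."""
    standard_error = math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    return (p1 - p2) / standard_error


# 59% of 27 long-tenure employees married vs. 37% of 60 short-tenure employees.
cr = critical_ratio_proportions(0.59, 27, 0.37, 60)
print(round(cr, 2))  # 1.94
```

The result agrees with the tabled 1.95 to within the rounding of the published percentages. The other rows cannot be checked in the same way because the table gives no standard deviations.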


Results 


Table 1 shows quite conclusively that, at the 
time of employment, employees who stay at least nine months on the 
job are older, have had less formal education, are more frequently married, 
and have more dependents than employees who leave the job within 
three months. In hiring for this job in the existing labor market and 
under the general conditions prevailing when this investigation was conducted, 
employees should be sought who are at least 30 years old, have 
not finished over 10 years of formal schooling, are married, and have at 
least one (and preferably two or more) dependents. 


Summary 


Analysis of personnel data of the type herein described is simple, 
rapid, and may often be made from data available from standard employ- 
ment procedure records. When differences of the type found in this 
study are revealed by such an analysis, the employment interviewer is 
immediately furnished with a helpful additional tool to guide his judgment 
concerning the likely tenure of an applicant if he is employed. The 
same method of analysis may also be used with other criteria of job success. 


Received August 4, 1947. 
Early publication. 





A Classification and Evaluation of Personnel Rating Methods 


Edwin B. Knauft 
The State University of Iowa 


In recent years business and industry have felt a need for more pre- 
cise methods of evaluating the performance of workers. Sometimes it 
is possible to evaluate them on the basis of quantitative data such as the 
number of units produced or the amount of goods sold. More frequently 
the output of the worker cannot be measured in such an objective manner 
and it is then necessary to resort to personal estimates of the employee’s 
value to his employer. When such estimates must be made, it is customary 
to attempt to “quantify” the judgments by means of some type of 
merit rating method. Frequently a rating system is initiated by an individual 
who has no understanding of the techniques of behavior measurement, 
with the result that several of the widely used rating methods¹ lack 
the precision offered by available psychometric methods. 

At the present time there is a need for a systematic evaluation of the 
various rating methods, with special emphasis on (1) the procedures in- 
volved in the construction and use of each type of rating scale, (2) the 
applicability of given rating methods to given situations, and (3) the 
comparative precision offered by the different methods. Such an evalua- 
tion may best be made after the rating techniques have been classified in 
some manner. Since no suitable classification is available, an operational 
classification of rating methods is here presented with the aim of stimulating 
a more critical approach toward the many rating systems now in 
use. 


The suggested classification of rating methods is based on (1) the 
operations performed by the experimenter or scale maker in constructing 
the rating scale,² (2) the operations performed by the raters or judges 
when rating a given individual on a scale, and (3) the operations involved 
in devising a scoring method for the scale. With such operational classifications 
at hand it will be possible to evaluate the rating methods and 
determine which methods are most applicable in certain concrete situations. 


1 For a comprehensive survey of specific personnel rating methods currently used by 
business and industry, cf. National Industrial Conference Board, Studies in Personnel 
Policy, No. 8, 1938, and No. 39, 1942. 

2 For the purposes of this discussion, the term “rating scale” will be broadly used 
to designate any systematic method of evaluating individuals in the absence of, or 
supplementary to, quantitative records of the individual’s output. 


I. Operations of Rating Scale Construction and Use 

The classification of rating scales and methods, based on construction-operations 
and use-operations, is summarized in Table 1. 


Table 1 
Operational Classification of Rating Methods 

Name of Method: Rank order 
  Operations of Scale Construction by Experimenter: Compiles list of names of 
    ratees for the use of the rater. 
  Operations of Scale Use by Rater: Ranks individuals on list from best to worst. 

Name of Method: Paired comparisons 
  Operations of Scale Construction by Experimenter: Compiles pairs of names of 
    ratees in which each name is paired with every other name. 
  Operations of Scale Use by Rater: Determines which ratee is the better of each 
    pair. 

Name of Method: Linear; Alphabetic; Numerical; Graphic; Defined distribution; 
    Behaviorgram 
  Operations of Scale Construction by Experimenter: Determines and defines 
    separate traits to be rated and constructs a continuum or several discrete 
    intervals for each trait, placing “guideposts” along each continuum. 
  Operations of Scale Use by Rater: Determines where ratee falls on each trait 
    continuum; may also write in reasons for his rating. 

Name of Method: Man-to-man 
  Operations of Scale Construction by Experimenter: Determines and defines traits 
    to be rated and directs raters to select and place five individuals at five 
    representative points on trait continuum. 
  Operations of Scale Use by Rater: Matches each ratee with one of five 
    individuals comprising comparison standard group. 

Name of Method: Weighted random check list 
  Operations of Scale Construction by Experimenter: (a) Collects large number of 
    behavioral descriptions applying to work ratees are doing; (b) requires group 
    of judges to sort or rank statements using one of psychophysical methods; 
    (c) selects final items on basis of scale value and dispersions obtained in (b). 
  Operations of Scale Use by Rater: Determines which items in the list apply to 
    or describe behavior of ratee. 

Name of Method: Forced choice 
  Operations of Scale Construction by Experimenter: (a) Collects large number of 
    behavioral descriptions or adjectives applying to work ratees are doing; 
    (b) obtains criterion measure of individuals who form scale standardization 
    group; (c) selects final items on basis of their differentiating value, using 
    criterion sub-groups. 
  Operations of Scale Use by Rater: Selects alternatives within each item as 
    being most descriptive and least descriptive of ratee. 


Practically all types of rating scales are represented in this table by their common 
names, but different combinations of construction-operations and use-operations 
may result in numerous other types of scales. For example, 
the descriptive phrases along the continuum of the graphic scale could 
be chosen by one of the psychophysical methods, based on the composite 
rankings of the phrases by a number of judges. A second combination 
of methods might be to use criterion groups to measure the differentiating 
value of items in a random check list in order to obtain the best set of 
items for the final form of the scale. 

Special mention should be made of two types of rating methods which 
are included in the table but have not been widely used. The weighted 
random check list utilizes Thurstone’s equal appearing intervals tech- 
nique (11) for the selection and assignment of weights to scale items. 
This method was first applied to industrial merit rating by Richardson 
and Kuder (9) and more recently by Marble (7). The forced choice 
technique as developed by the Personnel Research Branch of the Adjutant 
General’s Office (10) takes the form of a series of multiple choice items 
which the rater must check with respect to the ratee. The selection and 
weighting of alternatives within each item is accomplished by item analy- 
sis methods which require an external criterion measure of a large group 
of ratees. 


II. Operations Related to the Construction of a Scoring Method 
for the Rating Device 


In order to complete the operational analysis of personnel rating 
methods, it is necessary to enumerate the operations involved in con- 
structing a scoring method for the completed rating form. The majority 
of these scoring and weighting techniques are not new to psychology; 
rather they represent the application of established psychometric tech- 
niques to the field of personnel rating. Due to the recent interest in 
these scoring refinements, they will be considered in some detail. 

(1) Use of a technique for determining the relative standing of each 
ratee when all ratees have been compared directly with each other. If 
a number of judges are available to rank the same group of ratees, it is 
possible to compute a “scale value” for each ratee. When the rank order 
method has been used by the raters, the scale values for each ratee may be 
computed by the methods of Guilford (5, pp. 250ff) or Hull (6, pp. 382ff). 
The method of Guilford (4) is applicable when the original data have been 
collected by the paired comparisons method and each ratee has been 
judged by a number of raters. 

(2) Operations used to determine what weight should be assigned to 
each of several trait continua which comprise the rating scale battery. 
Several common types of scales, such as the graphic, linear and numerical, 
normally contain a number of different sub-scales or trait continua. By 
the decision of the scale maker each of these traits supposedly bears some 
relation to the total success or efficiency of the ratee in a given job.3 It 
is sometimes assumed that certain traits or factors are more important to 
total job success than other traits, and hence the various traits are differ- 
entially weighted when a ratee’s total score is computed. The trait 
continua may be weighted as follows. 

(a) The scale maker assigns an arbitrary weight to each trait. This 
method, often employed in industrial merit rating forms, is usually based 
on a job analysis of the ratee’s work situation. This analysis indicates 
which characteristics are important for success on the job. 

(b) A refinement of the above procedure consists in requiring a num- 
ber of “experts” familiar with the job in question to weight each trait or 
factor in terms of its relative importance to total worker success. The 
experimenter may follow the method of Toops (13, pp. 295ff) and in- 
struct the judges to distribute 100 points or bids among the various traits 
so as to reflect the relative importance of each trait. The final weight 
for each trait is then based on the mean number of points assigned to the 
trait by all the judges. 

(c) A third technique requires an “independent” criterion measure on 
the ratees and then utilizes a regression equation to find the optimal item 
or trait weights. Bolanovich (1) used as a criterion the supervisor’s 
judgment as to whether or not he would recommend the ratee for pro- 
motion. A regression technique was then applied to the rating scale 
data to find item weights which would yield a maximum correlation with 
the criterion. 
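
As an illustration of method (b) above, the following is a minimal sketch in which each judge distributes 100 points among the traits and the final weight of a trait is the mean of the points it received. The function name, the trait names and the judges' allocations are hypothetical and are not taken from Toops or from any study cited here.

```python
from statistics import mean


def mean_trait_weights(judge_allocations):
    """Average the points given to each trait over all judges.

    judge_allocations: list of dicts, one per judge, mapping a trait name
    to the points that judge assigned. Each judge must distribute exactly
    100 points over the same set of traits.
    """
    traits = set(judge_allocations[0])
    for allocation in judge_allocations:
        if set(allocation) != traits:
            raise ValueError("every judge must weight the same traits")
        if abs(sum(allocation.values()) - 100) > 1e-9:
            raise ValueError("each judge must distribute exactly 100 points")
    return {t: mean(a[t] for a in judge_allocations) for t in traits}


# Hypothetical allocations from three judges (illustrative values only).
judges = [
    {"quality": 40, "quantity": 35, "cooperation": 25},
    {"quality": 50, "quantity": 30, "cooperation": 20},
    {"quality": 45, "quantity": 25, "cooperation": 30},
]
weights = mean_trait_weights(judges)
for trait in sorted(weights):
    print(trait, weights[trait])
```

Because every judge's points total 100, the mean weights also total 100 and may be read directly as percentage weights.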

(3) Operations used to determine the value of the units on each sepa- 
rate trait continuum. After the relative value of each trait has been 
determined by one of the above methods, the scale maker encounters 
the problem of assigning values to represent varying degrees of possession 
of each trait. 

(a) The simplest method consists of arbitrarily assigning progressive 
integers to the successive “equal” units of the trait continuum. For example, 
a nine-point scale designed to measure “ability to get along with 
fellow workers” would designate the low end of the continuum with a 
value of “1”, the middle or average point as “5” and the high end of the 
continuum as “9”. In common practice the numerical value nearest to 
the point checked for the ratee is then multiplied by the trait weight to 
yield a trait score. 








3 There is evidence that a scale maker may include too many overlapping trait scales 
in a rating battery. Ewart, Seashore and Tiffin (2) subjected a merit rating battery of 
12 graphic trait scales to a factor analysis and found that only three distinct factors 
were being measured. Similar results were obtained by Bolanovich (1) in a different 
industrial situation. 
A modification of the method consists of placing a millimeter scale 
along the line representing the trait continuum. The value of the check 
mark indicating the ratee’s position is read as the distance in millimeters 
between the check mark and the low end of the continuum. Such 
methods are based on the assumption that the trait continuum is com- 
posed of a series of equal units. 

Sometimes the units are not considered as being equal or the various 
degrees of trait possession are indicated as discrete intervals, and are 
listed below each other in the form of a five-step check list. In this case 
a weight is assigned to each discrete interval by the scale maker or a committee 
of judges. 

(b) The score of each ratee on each trait is determined by one of the 
above methods and then this raw trait score is converted into a standard 
score based upon the distribution of the scores received by all the ratees 
on that trait. This method, suggested by Tiffin and Musser (12), is an 
improvement in scoring technique which takes into account the varia- 
bility of ratings received on each trait of the graphic type of scale. The 
standard score method is particularly useful when the scores received by 
the ratee on several traits are combined to give a composite score. If 
the traits themselves are differentially weighted, the ratee’s standard 
score on the given trait is multiplied by the trait weight before the scores 
are totaled. 
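
The standard score procedure in (b) can be sketched as follows. This is a minimal illustration under stated assumptions: the standard score is taken to be the ordinary z-score computed with the population standard deviation, and the trait names, raw scores and trait weights are hypothetical. It is not presented as the exact computation of Tiffin and Musser.

```python
from statistics import mean, pstdev


def standard_scores(raw_scores):
    """Convert one trait's raw scores to z-scores across all ratees."""
    m = mean(raw_scores)
    s = pstdev(raw_scores)  # population SD; an assumption of this sketch
    if s == 0:
        return [0.0 for _ in raw_scores]  # no variability on this trait
    return [(x - m) / s for x in raw_scores]


def composite_scores(raw_by_trait, trait_weights):
    """Weight each ratee's standard score on each trait and total them."""
    z_by_trait = {t: standard_scores(v) for t, v in raw_by_trait.items()}
    n_ratees = len(next(iter(raw_by_trait.values())))
    return [
        sum(trait_weights[t] * z_by_trait[t][i] for t in z_by_trait)
        for i in range(n_ratees)
    ]


# Hypothetical raw trait scores for three ratees (illustrative values only).
raw_by_trait = {
    "quality of work": [3, 5, 7],  # wide spread of ratings
    "cooperation": [4, 5, 6],      # narrow spread of ratings
}
trait_weights = {"quality of work": 2.0, "cooperation": 1.0}
composite = composite_scores(raw_by_trait, trait_weights)
print(composite)
```

In this example the two traits differ in raw variability, yet after conversion each contributes to the composite only in proportion to its assigned weight, which is the purpose of the refinement.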

(4) A different type of scoring procedure is available when the scale 
has been constructed by requiring a number of judges to sort each of a 
number of behavior descriptions on a continuum from “characteristic of 
the very poor employee” to “characteristic of the most valuable em- 
ployee.” This procedure, used in the construction of the weighted ran- 
dom check list, yields scale values or weights for each descriptive item. 
These weights are the median scale values assigned to the items by all the 
judges. After the check list for a given ratee has been completed by a 
rater, the composite score for that ratee is found by computing the mean 
or median of the scale values of the items which the judge has checked 
as applying to that ratee. 
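
The scoring in (4) can be sketched briefly. In this minimal illustration the item statements, the judges' placements and the 11-point sorting continuum are all hypothetical; only the two steps (median of the judges' placements as the item weight, then the mean or median of the checked items as the ratee's score) follow the description above.

```python
from statistics import mean, median


def item_scale_values(judge_placements):
    """Scale value of each item = median position assigned by the judges."""
    return {item: median(vals) for item, vals in judge_placements.items()}


def ratee_score(checked_items, scale_values, use_median=False):
    """Score a ratee from the items the rater checked as applying."""
    values = [scale_values[item] for item in checked_items]
    if not values:
        raise ValueError("no items were checked for this ratee")
    return median(values) if use_median else mean(values)


# Hypothetical placements by five judges on an assumed 11-point continuum
# (1 = very poor employee, 11 = most valuable employee).
judge_placements = {
    "Is often late to work": [2, 3, 2, 1, 2],
    "Meets production deadlines": [8, 7, 8, 9, 8],
    "Helps new workers learn the job": [9, 9, 10, 8, 9],
    "Needs close supervision": [3, 4, 3, 3, 2],
}
scale_values = item_scale_values(judge_placements)

checked = [
    "Meets production deadlines",
    "Helps new workers learn the job",
    "Needs close supervision",
]
print(ratee_score(checked, scale_values))
print(ratee_score(checked, scale_values, use_median=True))
```

The rater sees only the statements, not the scale values, so he cannot readily control the final score.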

(5) The scoring weight for each item or behavior description is deter- 
mined from an item analysis, using criterion standardization groups. 
The weight assigned to each item in a check list or to each alternative 
in a multiple choice item is based on the degree to which the item or alter- 
native discriminates between “high” and “low” criterion sub-groups. 
A modification of this technique has also been used by Ferguson (3) to 
correctly weight the intervals on a “degree of possession” continuum. 
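
One simple way of expressing the item analysis in (5) is sketched below. The index used here, the proportion of “high” criterion ratees for whom an item was checked minus the corresponding proportion of “low” criterion ratees, is an assumption chosen for illustration; the text does not specify which discrimination index was used, and the items and check lists are hypothetical.

```python
def discrimination_weights(high_group, low_group, items):
    """Weight each item by how well it separates the criterion sub-groups.

    high_group, low_group: lists of sets, one set per ratee, holding the
    items checked for that ratee. The weight is the difference between
    the proportions of each group for whom the item was checked.
    """
    weights = {}
    for item in items:
        p_high = sum(item in checked for checked in high_group) / len(high_group)
        p_low = sum(item in checked for checked in low_group) / len(low_group)
        weights[item] = p_high - p_low
    return weights


# Hypothetical check lists for four "high" and four "low" criterion ratees.
items = ["plans work ahead", "accepts criticism", "wastes materials"]
high_group = [
    {"plans work ahead", "accepts criticism"},
    {"plans work ahead"},
    {"plans work ahead", "wastes materials"},
    {"accepts criticism"},
]
low_group = [
    {"wastes materials"},
    {"accepts criticism", "wastes materials"},
    {"wastes materials"},
    {"plans work ahead", "wastes materials"},
]
weights = discrimination_weights(high_group, low_group, items)
for item in items:
    print(item, weights[item])
```

Items with weights near zero do not discriminate between the sub-groups and would be candidates for removal from the final form.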
III. The Appropriateness of the Rating Methods 


Now that the operations of rating scale construction, use and scoring 
have been described and classified, it is possible to proceed with a system- 
atic evaluation of the rating methods. The primary emphasis in the 
evaluation will be in terms of the intended purpose of the rating method 
in a given industrial situation. It cannot be arbitrarily assumed that 
any one set of operations will yield the best rating device for all situations, 
but in given situations certain types of operations are more appropriate 
than others. The classification presented indicates the arbitrariness of 
the a priori methods which are widely used in certain operations. Such 
methods hold claim to speed of construction and apparent ease of scale 
use, but few studies have demonstrated satisfactory reliability and valid- 
ity for such scales. Insufficient use has been made of the more precise 
psychophysical and item analysis techniques which are available to scale 
constructors. 

Since a critical evaluation of each type of rating device may be made 
from an examination of the operations listed in Table 1, the following 
discussion will be limited to three frequently encountered situations in 
which an employee evaluation or rating is desired. 

(1) The situation in which the results of the rating will be used as 
a criterion (a) for the standardization and validation of an industrial 
aptitude test battery, or (b) upon which promotions and transfers of 
employees will be based. In both of these instances the rating results are 
assumed to be as precise and objective as possible and hence the rating 
instrument should possess a high degree of reliability and validity. To 
this end several requirements should be satisfied: 

(a) Every effort must be made to reduce the variability of the raters’ 
individual interpretations of the given trait or behavior description. It 
seems desirable for each item to be a short description of on-the-job be- 
havior or a concrete example of inter-personal relations with fellow em- 
ployees or superiors. Each item should contain only one thought or 
description and should not require the rater to interpret the meanings 
of vague adjectives and generalized descriptions; a given scale item should 
have approximately the same meaning for all the raters. 

(b) The actual judgment task required of the raters should be as 
simple and straightforward as possible. Errors of human judgment 
are reduced if the rater is operating in a limited and well defined situation 
such as (1) being required to place a check in front of those statements 
which specifically describe the ratee, or (2) determining which one of five 
statements best describes and which one least describes the ratee. 

(c) The common judgment errors such as halo effect, the end effect 
and the error of central tendency should be kept to an absolute minimum. 
One effective method of reducing these errors is to use a format which 
(1) prevents the rater from controlling or forcing the final results of his 
ratings, and (2) hides or disguises the true scoring weight of the various 
items or the points along the scale continuum. 

(d) The method used should provide for the objective construction of 
two or more equivalent forms of the same scale or check list because the 
equivalent form method furnishes an excellent measure of the reliability of 
the device. 

(e) The items in the final scale should be selected by pretesting the 
original items so that only the more discriminating items remain in the 
final, shorter rating device. 

(f) The method of determining the scoring weights of the traits or 
items should be objective. These weights should be determined either 
from (1) the mean opinions of a number of “experts,” with provision to 
discard items upon which there is low agreement between judges, or (2) 
the discrimination value of each item, based on the results obtained from 
criterion sub-groups of ratees. 

(g) An unfortunate but necessary sequel to the above conditions is 
that considerable time and money must be available for the construction 
of the rating device. There is no short-cut to high reliability and 
validity! 

The following types of rating devices satisfy the above criteria: the 
weighted random check list and the forced choice technique. The 
technique of Ferguson (3) also satisfies the majority of the above criteria 
although it probably necessitates preliminary training of the raters for 
best results. Both the forced choice technique and Ferguson’s method 
are restricted to situations in which an external criterion is available and 
the number of ratees on a given job is large. 

(2) The situation in which a roughly systematic employee evaluation 
method is desired and the precision of the measuring instrument is not 
of prime importance. A secondary requirement of such situations often 
is that the results of the rating be capable of explanation to the employee 
by the rater-supervisor in order that the ratee may correct his errors and 
learn his standing in his department. Such a rating device usually must 
meet the following requirements. 

(a) It must be constructed in a short time and may require little 
financial expenditure or technical knowledge of scale construction on the 
part of the scale maker. 

(b) It must be capable of clear and concise explanation to the em- 
ployee who has been rated and should indicate specific factors that may 
be used in employee training. 

(c) As a sequel to the above, the raters must be trained in how to use the 
rating device, how to avoid the common rating errors and how to present 
the rating results effectively to the rated employee. 

The graphic rating scale combined with the behaviorgram adequately 
fulfills these requirements. The behaviorgram forces the rater to give the 
reasons for his ratings and enables the ratee to learn why he received a 
given rating on each trait or scale. A refinement in scoring may also be 
effected by converting the raw scores on each trait into standard scores. 

(3) The situation in which over-all fellow-employee ratings are de- 
sired. Such ratings may be useful as a measure of the morale or social 
and work relationships in a department and they may also be used as 
a basis for dividing individuals into criterion sub-groups for the subse- 
quent standardization and validation of rating scale items. Each em- 
ployee is required to rate each of his fellow-employees on over-all job 
ability and/or personality factors and then the mean rank standing of 
each employee is computed. A large number of individuals in approxi- 
mately the same type of job must be well acquainted with each other, 
or a number of different supervisors must be able to rate each employee, 
for the precision of such a criterion measure is largely a function of the 
number of raters ranking each individual. 

The most suitable devices satisfying these criteria are the rank order 
method, the paired comparisons method and the linear or graphic scale. 
The third method is more practical when the number of ratees is large, 
but the paired comparisons method may also be applied to a relatively 
large number of ratees if the method of Uhrbrock and Richardson (14) 
is used. In most instances the linear or graphic method may not furnish 
as precise results as the rank order or paired comparisons techniques. 
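
The mean rank standing described in situation (3) can be sketched as follows. This is a minimal illustration, assuming each rater supplies a complete ordering from best to worst; the names and rankings are hypothetical.

```python
from statistics import mean


def mean_rank_standing(rankings):
    """Average the rank each person receives over all raters.

    rankings: list of lists, each one rater's ordering of names from
    best (rank 1) to worst. A rater need not rank every person.
    """
    received = {}
    for ordering in rankings:
        for rank, name in enumerate(ordering, start=1):
            received.setdefault(name, []).append(rank)
    return {name: mean(ranks) for name, ranks in received.items()}


# Hypothetical rankings of four employees by three raters.
rankings = [
    ["Adams", "Baker", "Clark", "Davis"],
    ["Baker", "Adams", "Clark", "Davis"],
    ["Adams", "Clark", "Baker", "Davis"],
]
standing = mean_rank_standing(rankings)
for name in sorted(standing, key=standing.get):
    print(name, round(standing[name], 2))
```

With only three raters a single divergent judgment shifts a mean rank appreciably, which illustrates why the precision of the measure depends on the number of raters.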


Summary and Conclusion 


An attempt has been made to classify the personnel rating methods 
which involve human judgment. The classification was made on the 
basis of (1) the operations performed by the experimenter or scale maker 
in constructing the scale; (2) the operations performed by the judges or 
raters when applying the rating device to a given individual; and (3) 
the operations involved in constructing a scoring method for the rating 
device. In the light of this classification it is possible to evaluate the 
various rating methods and understand the assumptions underlying their 
construction and use. The evaluation of these methods has been oriented 
around three types of industrial situations where human judgments of 
other individuals are required. The discussion and classification of the 
rating methods has been directed primarily toward furthering progress 
in this area of industrial personnel psychology which has thus far failed 
to benefit fully from available psychometric techniques. 


Received February 19, 1947. 
Confusion Control in Poster Readership Study 


Charles L. Bigelow 
Facts Consolidated, Los Angeles, California 


Another chapter may now be written in the development of a research 
technique first described in these pages in 1940.1 As originally outlined 
by Darrell B. Lucas, the method was designed to eliminate false identifi- 
cation in recognition studies of magazine advertisements. Dr. Lucas was 
then in charge of New York University’s continuing study of advertising 
readership in four weekly magazines. Each week, he equipped his inter- 
viewers with scrapbooks made up partly of advertisements published the 
week before and partly of advertisements scheduled to appear in the 
following week. Rotating the content of the scrapbooks from week to 
week, Dr. Lucas obtained a pre-publication and a post-publication rating 
for every advertisement studied. Corrected ratings were computed by a 
formula that will be discussed below. 

Students of the printed page have generally agreed that recollection 
of readership has greater validity than simple recognition, however care- 
fully recognition may be controlled. As a consequence, most studies 
of magazine and newspaper readership today are based upon the recall- 
recognition of respondents examining the material in its context, that is, 
in an actual copy of the issue under study. While it is not in general 
use in the medium for which it was intended, therefore, the Lucas technique 
has been adapted to other media in which the study of advertisements 
in their context is not feasible. It is employed in the study of car-card 
readership,2 and has even been used to control a recognition study 
of radio spot announcements.3 

Most recent adaptation of the Lucas technique has been to the study 
of outdoor advertising. The present paper is based upon experience 
gained in conducting readership research for Foster and Kleiser Company,4 
although the same basic approach has been employed in another 
study of poster readership.5 It should be explained that the sole purpose 

1 D. B. Lucas. A rigid technique for measuring the impression values of specific 
magazine advertisements. J. appl. Psychol., 1940, 24, 778-790. 

2 Advertising Research Foundation. Continuing study of transportation advertising, 
1944 to date. Dr. Lucas is the Foundation’s technical director. 

3 Edward Petry & Co. What radio research forgot. 1946. 


4 Foster and Kleiser Co. A study of outdoor advertising in Sacramento. 1946. 
5 Traffic Audit Bureau. Methods for the evaluation of outdoor advertising. 1947. 




of using the Lucas technique in the Foster and Kleiser study was to con- 
trol confusion, or false identification made in good faith. Interviewer 
bias and lack of candor on the part of respondents, two other distortions 
which the Lucas technique seeks to eliminate, were already virtually pre- 
cluded by the mechanics of the interview and the nature of the posters 
selected for study. 


The Sacramento Study 


Foster and Kleiser Company was anxious to obtain readership ratings 
for a group of typical posters, studied under normal conditions and con- 
sequently at the expense of comparability. Posters ranged in intensity 
of showing from 50 to 100 per cent,6 and advertisers ranged in continuity 
of campaign from some having had displays up in Sacramento (which 
was selected as the test market) in each of the six months preceding the 
study to some who had not previously used the medium. Posters were 
selected without consulting the advertisers, who were notified of the 
study too late to permit copy changes. 

Readership ratings were developed for 20 posters in all. Interviews 
were conducted in May and July, 1946, with two samples, each of approxi- 
mately 500 persons and each obtained by the same rigid “systematic 
random” sampling technique. The May sample was shown full-color 
reproductions of eight posters displayed in April and 12 scheduled for 
June. The July sample was shown the same 12 posters, which by then 
had been displayed for 30 days and “covered” (i.e. removed from sight), 
as well as ten other posters which had never been displayed in Sacramento. 
Six of the latter had been posted locally in distant West Coast cities, and 
four were fictitious designs prepared by Foster and Kleiser Company 
artists. 

Apart from their use in studying confusion, the reason for including 
poster designs not previously displayed in the test market was to dis- 
courage guess-work on the part of respondents by the creation of what 
has been called “an atmosphere of semi-familiarity.”7 

Alleged recognition of the 12 posters studied before and after posting 
is shown in Table 1. Although, as already noted, accomplishments of 
individual posters are not comparable, the table is felt to picture the 
relationship that may normally be expected to exist between pre-posting 
and post-posting ratings. 


6 This refers to the number of panels on which a poster is displayed. A 100 per cent 
intensity on Foster and Kleiser boards in Sacramento at the time of the study consisted 
of 16 panels, half of which were illuminated. 

7 Acknowledgment is made to the Continuing Study of Transportation Advertising. 

8 Pre-posting recognition ratings are here published for the first time by permission 
of Foster and Kleiser Company. 
Table 1
Ratings of Posters Before and After Posting

                            Rating (Per Cent Recognizing)
Poster                      Before Posting   After Posting
Acme Beer                         36              68
Pontiac                           36              57
Kellogg’s Corn Flakes             34              47
Par-T-Pak Beverages               34              47
Squirt                            27              47
Dodge                             18              33
Nash                               8              31
General Tire                      18              30
Hunt’s Tomato Sauce               12              28
Dennison’s Chili                  Ve              25
Bank of America                   17              20
Mercury                            9              16





Rank order correlation coefficient = +0.86. 
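
A rank order correlation of this kind can be computed as sketched below. The sketch assumes the coefficient is Spearman's rho with tied values given the mean of the ranks they occupy; the source does not state the exact formula used. The Dennison’s Chili row is omitted because its pre-posting figure is illegible in this copy, so the result for the remaining 11 posters should not be expected to reproduce +0.86 exactly.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from largest (rank 1) down; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_order_correlation(x, y):
    """Spearman's rho: the product-moment correlation of the two rank sets."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# The 11 legible rows of Table 1 (Dennison's Chili omitted, see above).
before = [36, 36, 34, 34, 27, 18, 8, 18, 12, 17, 9]
after = [68, 57, 47, 47, 47, 33, 31, 30, 28, 20, 16]
rho = rank_order_correlation(before, after)
print(round(rho, 2))
```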


It is believed that an explanation of the high rank order correlation 
may be found in the nature of the medium itself. The average poster 
is read by persons who are in motion and have only a few seconds for 
observation. It is therefore designed for rapid identification, and this 
is frequently achieved by the use of well-known trade marks and logo- 
types and illustrations of well-known packages. Copy is held to a mini- 
mum. The design is posted on a number of panels which are so located 
that a person doing considerable traveling about the city may observe 
it several times in the course of a single day. It remains posted for 30 
days, and it is frequently replaced by another poster of the same ad- 
vertiser and of very similar design. In this way, the advertiser takes 
full advantage of the first poster’s readership and achieves a sort of cu- 
mulative recognition of the second. 

In short, outdoor advertisers tend generally to foster public famili- 
arity with their posters through consistency of design and continuity of 
campaign. To the extent that they are successful, this familiarity will 
manifest itself in a relatively large readership of subsequent posters and 
in a correspondingly high false identification of subsequent posters, should 
these be studied prior to their posting. 


Analysis of Three Posters 


The effect of familiarity can be brought into sharper focus by ex- 
amining more closely the posters of three advertisers—Nash, General 
Tire and Kellogg’s Corn Flakes—which represent varying degrees of 
consistency in design. While these findings cannot be accepted as conclusive, 
they were regarded as sufficiently indicative to govern policy 
with respect to confusion control in the Foster and Kleiser study. 

The Nash poster was one of a series which, although following a con- 
sistent copy theme, differed sharply from month to month in illustrative 
content and manner of deriving headlines. As a result, and despite the 
fact that four other Nash posters had been shown in Sacramento in the 
previous six months, this design upset the rank order correlation by re- 
ceiving the lowest false (pre-posting) recognition of any of the 12, while 
ranking seventh in recognition after posting. 

The General Tire design was one of a series which varied less in ap- 
pearance, but which lacked the sharp brand identity that characterized 
the poster copy of certain other advertisers. This poster’s pre-posting 
and post-posting ratings were in the ratio of three to five, almost exactly 
that of the 12 posters together, which had an average recognition before 
and after posting of 22 and 37 per cent, respectively. 

The Kellogg’s Corn Flakes poster typified those designed to build up 
maximum familiarity by consistency of layout and the picture of a well-known 
package. The Kellogg poster was one of four to be falsely identified 
prior to posting by more than one-third of the sample. 

A further illustration of the relationship existing between ratings 
obtained before and after posting is presented in Table 2, which shows 
variation by age group in recognition of the same three posters. 


Table 2
Variations by Age Group in Ratings (Per Cent Recognition) of Three Selected Posters

                 Nash                General Tire        Kellogg’s Corn Flakes
                 Before    After     Before    After     Before    After
Age Group        Posting   Posting   Posting   Posting   Posting   Posting
Total               8        31        18        30        34        47
14-19               3        56        16        28        50        59
20-34               7        36        19        33        31        58
35-49              11        26        20        33        31        37
50 and over        10        21        16        24        33        40





Young people were highest in recognition of the Nash poster after its 
posting, but lowest in false identification prior to posting. This is the 
relationship to be expected in the case of a poster so different in design 
from earlier posters of the series that it may be termed unfamiliar. Those 
to whom such a poster is addressed9 and who are most interested in its 


9 The Nash poster carried the headline, “Class Leader,” and was illustrated by a 
picture of a high school senior, laden with trophies; it was displayed in the month of 
graduation. 
content will be the most convinced, prior to posting, that they have not 
seen it, and will be the most aware of it after posting. 

Recognition of the Kellogg’s Corn Flakes poster illustrates the reverse 
relationship that may be expected in the case of a familiar design. Here 
the group to whom it is addressed will be the most prone to recognize 
it erroneously prior to posting, since this same group has paid the most 
attention to earlier posters of the same series and is therefore most subject 
to confusion. 


Application of the Lucas Formula 


Dr. Lucas developed a formula for arriving at “controlled recognition” 
ratings of magazine advertisements studied out of their context. To 
understand the formula, it should be remembered that a magazine adver- 
tisement (unlike a poster design) is written with the knowledge that the 
reader, if he looks at it at all, will continue to do so for as long as it is 
able to hold his interest. Below its headline and illustration, there are 
often 100 words or more of copy. A magazine advertisement is seldom 
“read” in its entirety more than once, and it is “observed” (i.e. registers 
a repeat impression) a very limited number of times. If its general layout 
resembles that of other advertisements in the same series, there is a strong 
possibility that a readership respondent who has not seen it will confuse 
it with another which he did see, thereby making a false identification. 

Dr. Lucas reasoned that the proportion of the sample subject to con- 
fusion when the advertisement was studied after publication would ap- 
proximate the proportion of confused respondents measured by pre-publi- 
cation study. Thus, in the Lucas formula, the pre-publication rating 
(expressed as a percentage of the sample) becomes the proportion of 
“unreliable” respondents and is deducted both from the post-publication 
rating and from the base: 


$$\text{Corrected recognition} = \frac{\text{post-pub.} - \text{pre-pub.}}{100\% - \text{pre-pub.}}$$


Research counsel for Foster and Kleiser Company took the position 
that Dr. Lucas’ reasoning obtained chiefly in the case of advertisements 
that were uniformly unfamiliar to respondents. When applied to posters 
studied under normal conditions, the formula appeared to treat famili- 
arity as evidence of “unreliability,” thus placing a penalty on the con- 
sistency of design and continuity of campaign which had been the means 
of achieving familiarity.10 Application of the formula to the ratings of 


10 In a letter to William F. Fielder of Fielder, Sorenson and Davis, April 2, 1947, 
released in connection with Study No. 8 of the Continuing study of transportation adver- 
tising, Dr. Lucas has agreed that “the pre-examination score is roughly a measure of 
the three posters discussed above appears to confirm this position (see 
Table 3). 


Table 3
Application of the Lucas Formula to the Ratings of Selected Posters

                          Rating (Per Cent Recognizing)       Per Cent Reduction
                          Before      After      After        Resulting from
Poster                    Posting     Posting    Correction   Correction

Nash                         8          31          25             19
General Tire                18          30          15             50
Kellogg’s Corn Flakes       34          47          20             57
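The corrected ratings in Table 3 can be checked by direct arithmetic. The following is a minimal Python sketch and not part of the original study. The function names are hypothetical, and it assumes that the corrected rating is rounded to a whole per cent before the reduction is computed, an assumption that happens to reproduce the tabled figures.

```python
def lucas_corrected(pre_pct, post_pct):
    """Lucas correction: (post - pre) / (100 - pre), returned as a per cent."""
    return 100.0 * (post_pct - pre_pct) / (100.0 - pre_pct)


def table3_row(pre_pct, post_pct):
    """Return (corrected rating, per cent reduction), both as whole per cents.

    Assumption: the reduction is taken from the rounded corrected rating.
    """
    corrected = round(lucas_corrected(pre_pct, post_pct))
    reduction = round(100.0 * (post_pct - corrected) / post_pct)
    return corrected, reduction


# Ratings before and after posting, taken from Table 3.
posters = {
    "Nash": (8, 31),
    "General Tire": (18, 30),
    "Kellogg's Corn Flakes": (34, 47),
}

for name, (pre, post) in posters.items():
    corrected, reduction = table3_row(pre, post)
    print(f"{name}: corrected {corrected}, reduction {reduction} per cent")
```

The sketch makes the penalty on familiarity visible: the higher the rating before posting, the larger the share of the post-posting rating that the formula removes.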





Alternate Corrections 


There could be no doubt, on the other hand, that an unknown number
of “unreliable” respondents were contributing to the positive recognition
of posters after posting. Seeking some evidence of “unreliability” that
would not involve familiarity, the research firm turned to the positive
identification of fictitious and out-of-town designs, shown to the July
sample. None of these received a false identification of more than six per
cent. Mean recognition of fictitious posters was 3.3 per cent, and of the
out-of-town posters, 3.5 per cent. In the Foster and Kleiser study, there-
fore, 3.4 per cent of the response was regarded as “unreliable,” and con-
fusion correction consisted of reporting only 96.6 per cent of the post-
posting recognition rating developed for each poster.⁴
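The blanket correction amounts to a single multiplication. The sketch below is illustrative only: it assumes that the 3.4 per cent figure is the simple average of the two means quoted above, and the function name and the sample rating passed to it are hypothetical.

```python
def blanket_corrected(post_pct, unreliable_pct):
    """Report only the 'reliable' share of a post-posting recognition rating."""
    return post_pct * (100.0 - unreliable_pct) / 100.0


# Mean false identification of fictitious and out-of-town designs (per cent).
fictitious_mean = 3.3
out_of_town_mean = 3.5

# Assumption: the study's 3.4 per cent is the simple average of the two means.
unreliable_pct = (fictitious_mean + out_of_town_mean) / 2

# Hypothetical post-posting rating of 30 per cent.
print(round(unreliable_pct, 1), round(blanket_corrected(30.0, unreliable_pct), 2))
```

Because the same factor is applied to every poster, the correction changes the level of the ratings but not their order.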

There is good reason to believe, however, that every advertisement 
has about it certain characteristics that will lead some respondents to 
associate it with something else they have seen somewhere before. It 
would seem, moreover, that advertisements differed in the amount of 
confusion they generated. A “pretty girl” advertisement, for example,
will probably create more confusion than one featuring a less hackneyed 
illustration, although it may also attract more readers. Provision of a 
separate correction factor for every advertisement studied is one of the 
most important aspects of Dr. Lucas’ method. Conversely, the chief 
weakness of the correction described above was its blanket nature. 

the accumulated audience of those familiar campaign features which are repeated in
each advertisement.” He explains that application of the formula then yields “the
number of people who have been impressed with some new or distinct feature which is
carried exclusively in the measured advertisement.” When an advertisement has no
new or distinctive features, however, Dr. Lucas grants that his technique “leaves the
local advertising operator without any exact basis for estimating the total net audience”
which may have seen the advertisement in question.


4 The same correction factor, developed in July, was also used to correct recognition
of the eight designs posted in April and studied in May.

Experience gained in conducting the Sacramento study points to a
relatively simple modification of the Lucas method for the purposes of
outdoor readership study. This modification would be to substitute for
the pre-posting study a concurrent out-of-town study. In practice, there
would actually be two test cities. Interviewers’ kits would be made up
half of designs which had been displayed in City “A” but never in City
“B”, and half of designs which had been displayed in City “B” but never
in City “A”. Interviewing would take place in the two cities simulta-
neously, and false identification of each poster in the city in which it had
not been posted would be used to correct the rating obtained for the same
poster in the city in which it had been posted.
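A rough sketch of how the two-city correction might be computed is given below. The text does not state the arithmetic, so the sketch assumes that the out-of-town false identification simply takes the place of the pre-posting rating in the Lucas formula. The function names, design names, and ratings are all hypothetical.

```python
def corrected_rating(posted_pct, false_id_pct):
    """Lucas-type correction with the out-of-town false identification
    standing in for the pre-posting rating (an assumption)."""
    return 100.0 * (posted_pct - false_id_pct) / (100.0 - false_id_pct)


# Hypothetical recognition ratings (per cent) obtained simultaneously in both cities.
ratings = [
    {"design": "Design 1", "posted_in": "A", "A": 40.0, "B": 4.0},
    {"design": "Design 2", "posted_in": "B", "A": 6.0, "B": 35.0},
]


def correct_all(rows):
    """Correct each design's home-city rating by its rating in the other city."""
    results = {}
    for row in rows:
        home = row["posted_in"]
        away = "B" if home == "A" else "A"
        results[row["design"]] = corrected_rating(row[home], row[away])
    return results


for design, value in correct_all(ratings).items():
    print(design, round(value, 1))
```

Each design receives its own correction factor, which is the feature of the Lucas method that the blanket correction lacked.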

The proposed modification might be valuable in detecting “confusion”
—in the popular sense of distraction, befuddlement, or uncertainty—as
it affected the recognition of various posters in varying degrees. It 
would not destroy or discount the advantage of familiarity, but it would 
fail to correct for two types of confusion in the strict sense: confusion with 
earlier posters in the same campaign, and confusion with previous and 
current advertising of the same advertiser in other media. 

In the outdoor medium, the danger of confusion with earlier posters of 
the same campaign is believed to be negligible. Posters are displayed 
for 30 days and (barring severe illness, prolonged absence from the city, 
or drastic change in travel habits) a respondent, who has seen an earlier 
poster enough times to remember it, will probably also have seen the 
design last displayed. 

Confusion of media, on the other hand, presents a very serious threat 
to the validity of readership ratings in outdoor advertising and elsewhere. 
How the problem can best be solved will depend largely on the direction 
taken by poster research in the future. One solution, of course, would be 
to limit study to designs not appearing in other media, but such a policy 
would substitute arbitrary specifications for the normal conditions under 
which poster readership has hitherto been studied. And in the same 
way that continuity of campaign makes for familiarity, advertising in
other media also conditions the public for rapid recognition of poster 
designs. 


Conclusions 


While the Foster and Kleiser study seemed to show that pre-posting 
ratings were not suitable for the control of confusion in outdoor reader- 
ship research, it also suggested that they had another value. If adver- 
tisers come to acknowledge the advantage of familiarity in obtaining in- 
stant identification of their posters, they can find few better ways of 
measuring their accomplishment in this regard than the recognition study
of designs prior to posting. Such study, however, would deal with the
impact and penetration of posters, and would also involve frequency of 
impression. How many times, for example, does the eye “see” a poster
before it transmits an advertising impression to the brain? How many 
of the poster’s 30 days are spent in winning recognition? It may be that 
further adaptations of Dr. Lucas’ methods will lead into these still un- 
explored fields of marketing research. 

Received August 3, 1947.


An Efficient Method of Obtaining Counts for Computing
the Interrelation of Test Items


George E. Mount 
University of California at Los Angeles 


The complete analysis of objective tests during construction and 
standardization requires investigation of the individual test items. Such 
investigation, termed item analysis, involves a computation of the re- 
lation of each item to some criterion. 

It is generally recognized and stated that the analysis makes certain 
assumptions concerning the interrelation of the test items. Items or 
groups of items closely related do not contribute to the test in the same 
manner as items which are not related. Because of the prohibitive 
amount of labor necessary in computing item intercorrelations it has been 
commonly assumed that the interrelation of items is negligible or ap-
proximately equal. That this assumption is often not justified is shown
by the failure of many attempts to improve test discrimination by use of 
weights based on a simple relation of each item to the criterion. 

It is the purpose of this paper to describe an efficient method of ob- 
taining the counts required for computation of the interrelation of the 
test items, using the Graphic Item Counter of the IBM Test Scoring 
Machine. Making these counts has constituted a large portion of the
total labor involved and it is hoped that the method will encourage test 
standardization based on a more complete analysis of the items. 

To fill the cells of a 2 X 2 table, four independent counts are re-
quired as shown in Table 1. Here the number answering both items 1
and 2 correctly is designated as “b,” the number answering both items
1 and 2 incorrectly as “c,” the number answering item 1 correctly, but 2
incorrectly as “a,” and the number answering item 1 incorrectly but 2
correctly as “d.” The counts most easily obtained by the method to be
described are: the total number of cases, (N); the number of correct 
answers on item 1, (a + b); the corresponding count for item 2, (b + d) 
and the number of cases with correct answers on both items, (b). These 
counts are illustrated in Table 1. The counts obtained directly are then 
translated into percentages for computation of tetrachoric correlations 
with the Thurstone diagrams (1) or are used to obtain the counts corre-
sponding to other cells in the 2 X 2 table for application to other methods
of computing the tetrachoric coefficient (4), the phi coefficient (3), or to
tests of significance (2, 5).


Table 1
2 X 2 Table for Computing Item Interrelation

                        Item 2               Item 2
                        Incorrect Answers    Correct Answers

Item 1
Correct Answers         Count a              Count b              a + b

Item 1
Incorrect Answers       Count c              Count d              c + d

                        a + c                b + d                N
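The remaining cells of the 2 X 2 table follow by subtraction from the four counts obtained directly. The sketch below is illustrative only: the function names and the sample counts are hypothetical, and the phi coefficient is written in its usual product-moment form using the cell labels of Table 1.

```python
import math


def fill_cells(n, item1_correct, item2_correct, both_correct):
    """Derive cells a, b, c, d from N, (a + b), (b + d) and b."""
    b = both_correct
    a = item1_correct - b      # item 1 correct, item 2 incorrect
    d = item2_correct - b      # item 1 incorrect, item 2 correct
    c = n - a - b - d          # both incorrect
    return a, b, c, d


def phi_coefficient(a, b, c, d):
    """Phi for the 2 X 2 table; b and c are the agreeing cells."""
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (b * c - a * d) / denominator


# Hypothetical counts: 100 cases, 60 correct on item 1, 50 on item 2, 40 on both.
a, b, c, d = fill_cells(100, 60, 50, 40)
print(a, b, c, d, round(phi_coefficient(a, b, c, d), 3))
```

Only one joint count per item pair is needed from the machine, which is why the method concentrates on obtaining the b counts efficiently.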

The method requires the use of an International Business Machines 
Test Scoring machine equipped with Graphic Item Counter. This neces- 
sitates marking the correct answers of the original data on IBM answer 
sheets. No mark is needed for items answered incorrectly. If the test 
being studied has been administered using IBM answer sheets, these 
may be used without modification, provided they are well marked, and 
provided that the machine is set to count correct answers only. The 
method is readily applied to any data which can be dichotomized 
and where the number of variables makes hand counting inefficient. 
For this purpose each variable is assigned to one item position and the plus 
cases for each variable are recorded on the answer forms in the same 
manner as for test questions. 

The scoring machine is first set up to count the number of correct 
answers for each test question. These are the counts a + b and b + d. 
The unit will record up to 115 test answer sheets for 90 questions on a 
single run. Following this a count is made of the answer sheets with 
correct answers on two questions. This is done taking each item in turn 
with every other until all combinations of two have been counted. These 
are the b counts, obtained by wiring two items to the same counter and 
setting the machine to count only those answer sheets with the correct 
answer on both items.¹ Instructions concerning the mechanics of making
the counts are contained in the standard instruction manual accompany- 
ing the scoring machine. A description of the operation of this counter 
is given by McNamara and Weitzman (6). 

The counts can be made in the fewest possible runs only if the item 
positions are wired together in a particular pattern each time a run is 
made. Table 2 illustrates the correct pattern for each run for tests with 


1 This is accomplished with the use of a “commoner sheet” and the “multiple re-
sponse” switch located on the scoring machine.


Table 2
Wiring Pattern for Item Combinations *

Nine Item Table for Obtaining Item Wiring Connections
(Run numbers are shown in the upper half of the Table)

Item
Number
   9      1   2   3   4   5   6   7   8   X
   8      9   1   2   3   4   5   6   X
   7      8   9   1   2   3   4   X
   6      7   8   9   1   2   X
   5      6   7   8   9   X
   4      5   6   7   X
   3      4   5   X
   2      3   X
   1      X

          1   2   3   4   5   6   7   8   9
                     Item Number

Item Wiring Connections Based on an Eight Item Table

Run
Number    Item Wiring Connections
   1      1&8, 2&7, 3&6, 4&5
   2      1&7, 2&6, 3&5, 4&8†
   3      1&6, 2&5, 3&4, 7&8
   4      1&5, 2&4, 6&8, 3&7†
   5      1&4, 2&3, 5&8, 6&7
   6      1&3, 4&8, 5&7, 2&6†
   7      1&2, 3&8, 4&7, 5&6
   8      1&5†, 2&8, 3&7, 4&6





* The method of wiring is illustrated in two forms to make the general wiring method 
† For an even number of items it is possible to duplicate certain counts to make a
spot check of accuracy. This is done by wiring the unused items for each of the even
numbered runs. In the eight item example this is for item connections 4 & 8, 3 & 7,
2 & 6, 1 & 5. These duplicate connections are made in runs 6, 8, 2 and 4 respectively.


odd or even number of questions. The same principles will apply up to 
180, the maximum number of questions that can be handled. The order 
of wiring is such that the counts can be easily recorded. 
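One way to generate a wiring schedule of this kind is to group item pairs by the sum of their item numbers, taken modulo the number of items. The sketch below is an illustration rather than the author's stated rule: the function names are hypothetical, and the run numbering is chosen so that the eight-item case matches the listing in Table 2.

```python
from itertools import combinations


def wiring_runs(n_items):
    """Group every pair of items into runs so that no item is used twice in a run.

    Pairs whose item numbers give the same sum modulo n_items share a run.
    Assumption: run r collects the pairs with (i + j) % n_items == (2 - r) % n_items.
    """
    items = range(1, n_items + 1)
    runs = {}
    for run in range(1, n_items + 1):
        target = (2 - run) % n_items
        runs[run] = [(i, j) for i, j in combinations(items, 2)
                     if (i + j) % n_items == target]
    return runs


def unused_items(pairs, n_items):
    """Items not wired in a run; for an even number of items these can be
    wired together to duplicate a count as a spot check."""
    used = {item for pair in pairs for item in pair}
    return sorted(set(range(1, n_items + 1)) - used)


runs = wiring_runs(8)
for run, pairs in runs.items():
    print(run, pairs, "unused:", unused_items(pairs, 8))
```

Every pair of items falls in exactly one run, so the number of runs needed for the joint counts equals the number of items, whether that number is odd or even.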

The accuracy of this method is nearly perfect provided the following 
conditions are met: (a) The marks on the answer forms must be reason- 
ably heavy. (A good mark is reliable for about 40 machine runs.) (b) 
The sensing unit of the scoring machine must be clean. Even under 
conditions with considerable dust, it has been found in actual use that 
15,000 papers can be run reliably before surfaces of sensing contacts are
filed. With optimum conditions this number would probably be greater.
If absolute accuracy is required, two separate runs for each count can be
made; on the second run the number of cases in which a mark has been
made in at least one of two item positions is obtained.² This count
subtracted from the total number of cases (N) will give count “c”.

The amount of time necessary to obtain the required counts is approxi- 
mately 12-20 minutes per machine run, depending on the time necessary 
to wire the plug board. The number of machine runs will depend on the 
number of variables and the number of cases. For each group of 115 
papers it is equal to the number of variables plus one for a number of 
variables 90 or less, plus two for a number of variables 91 to 180. For


2 This is accomplished with the “multiple response” switch in the off position.

a study involving 500 cases with 40 variables this was found to be approxi-
mately 50 man hours. The amount of time required to accomplish the
same result by hand would be well over 1000 man hours net, assuming 
two men did the tabulating. Machine counting would not be appreciably 
speeded by more than a single person doing the work. 
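The run count and time estimate can be reproduced from the figures given above. The following sketch is illustrative only, and the function names are hypothetical. It assumes that a final partial group of papers still requires a full set of runs.

```python
import math


def machine_runs(n_cases, n_variables, sheets_per_run=115):
    """Runs needed: (variables + 1) per group of 115 papers for 90 variables
    or fewer, (variables + 2) per group for 91 to 180 variables."""
    if not 1 <= n_variables <= 180:
        raise ValueError("the method handles at most 180 variables")
    groups = math.ceil(n_cases / sheets_per_run)  # assumption: partial groups count fully
    extra = 1 if n_variables <= 90 else 2
    return groups * (n_variables + extra)


def hours_range(runs, min_minutes=12, max_minutes=20):
    """Total time in hours at 12 to 20 minutes per machine run."""
    return runs * min_minutes / 60.0, runs * max_minutes / 60.0


runs = machine_runs(500, 40)
low, high = hours_range(runs)
print(runs, "runs, between", round(low), "and", round(high), "hours")
```

For 500 cases and 40 variables this gives 205 runs, and the reported figure of approximately 50 man hours lies inside the resulting range.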


Received January 13, 1947.


A Rapid Method of Computing Standard Scores


Bert R. Sappenfield 
Montana State University 


There are many occasions in work with the normative data of psy- 
chological tests when it is desirable to compute standard scores. This 
is especially true since there is little evidence that authors of tests agree 
on what kind of data should be supplied in test manuals. Normative 
data are sometimes presented in the form of standard scores, though more 
frequently such data are in the form of centiles, I.Q. equivalents, or M.A.
equivalents. For purposes of inter-test comparison or combination,
standard scores in some form become practical necessities. Whenever it
is possible to obtain, directly or indirectly, the mean and standard devia- 
tion for the standardization group, standard scores may be computed. 
It is with a rapid method of performing such computations that this 
paper will be concerned. 

The formulae for five commonly used types of standard scores may 
be presented in the following systematic form:


(1) $z\text{-score} = (1/\sigma)(X - M) + 0$

(2) $SS_{(5,1)} = (1/\sigma)(X - M) + 5$

(3) $T\text{-Score} = (10/\sigma)(X - M) + 50$

(4) $\text{Hull-Score} = (14/\sigma)(X - M) + 50$

(5) $SS_{(100,20)} = (20/\sigma)(X - M) + 100$


In the present discussion, the first term in the formulae (e.g., 1/σ in For-
mula 1) will be referred to as the “rate,” since this is the rate of increase in
standard scores for each unit of increase in raw scores. The dividend of 
this term is equivalent to the number of units per standard deviation in 
the standard score scale. The value occurring after the plus-sign desig- 
nates the mean of the standard scores. 

The following steps are recommended for obtaining a set of standard 
scores equivalent to each of the raw scores in a distribution: 


1 For example, when it is desired to convert centiles into standard scores, a reasonable 
approximation of the mean and standard deviation may be obtained through the assump- 
tion that scores for the standardization group were normally distributed. The mean 
may then be taken as equal to the fiftieth centile and the standard deviation as equal
to $(P_{75} - P_{25})/(2 \times .6745)$.




Table 1
T-Scores (Entries) Corresponding to Raw Scores 20 to 49. (M = 35; σ = 5)

         0    1    2    3    4    5    6    7    8    9

 40     60   62   64   66   68   70   72   74   76   78
 30     40   42   44   46   48   50   52   54   56   58
 20     20   22   24   26   28   30   32   34   36   38

Bottom row contains cells for T-scores corresponding to raw scores 20 to 29, inclu-
sive; etc.

Rate = 10/σ = 10/5 = 2.0.

T-score for raw score 20 = 2.0 (20 − 35) + 50 = −30 + 50 = 20.

T-score for raw score 49 = 2.0 (49 − 35) + 50 = 28 + 50 = 78.


1. Prepare a table, such as Table 1, for entry of standard scores. In 
this table the arrangement is in terms of raw scores, with rows repre- 
senting the tens and columns representing the units. Standard scores 
are entered in cells at the row-column intersection representative of a 
given raw score (e.g., raw score 33 is represented in row 30, column 3). 

2. Compute the “rate” by performing the division indicated by the
first term of the formula (e.g., 10/σ in Formula 3). Retain four or five
decimal places.

3. Find the standard score corresponding to the lowest raw score. 
Let X equal the lowest raw score and M equal the mean of the raw scores. 
Multiply (X—M) by “rate” found in (2) and add the value indicated 
in the formula (mean of standard scores). Enter this result in the cell 
of the table corresponding to the lowest raw score. 

4. To obtain each successive standard score, add the “rate” to the
value already reached, and enter each new value in the proper cell of the 
table. Continue this process of accumulating the “rate” until the table 
is filled. 

5. Check the entire table by multiplying the “rate” by (X—M) and
adding the value required by formula, letting X equal the highest raw 
score. If this result checks with the last standard score entered, the en- 
tire table can be considered correct. 

When a calculator is used, the procedure here described can be ac- 
complished with extreme rapidity. It will be noted that only one divi- 
sion operation and two multiplication operations are required; the re- 
mainder of the procedure involves addition. 
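The five steps can be sketched in a few lines of Python. This is an illustration only: the function name and its parameter names are hypothetical, and the tolerance used in the final check is an assumption that stands in for the visual comparison made on a calculator.

```python
def standard_score_table(lowest, highest, mean, sd,
                         units_per_sd=10.0, new_mean=50.0):
    """Build standard scores for raw scores lowest..highest by accumulation.

    The defaults give T-scores (Formula 3); other scales change
    units_per_sd and new_mean.
    """
    rate = units_per_sd / sd                     # step 2: the single division
    score = rate * (lowest - mean) + new_mean    # step 3: first multiplication
    table = {lowest: score}
    for raw in range(lowest + 1, highest + 1):   # step 4: repeated addition
        score += rate
        table[raw] = score
    check = rate * (highest - mean) + new_mean   # step 5: second multiplication
    if abs(table[highest] - check) > 1e-6:
        raise ValueError("accumulated value does not agree with the check value")
    return table


# The example of Table 1: raw scores 20 to 49, M = 35, sigma = 5.
t_scores = standard_score_table(20, 49, mean=35, sd=5)
print(t_scores[20], t_scores[33], t_scores[49])
```

The check in step 5 guards against an error made at any point in the accumulation, since such an error would be carried forward to the final entry.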


Received January 20, 1947. 
































An Experiment on the Design of Tables and Graphs 
Used for Presenting Numerical Data * 


Launor F. Carter 
University of Rochester 


The purpose of this study was to determine the relative effectiveness 
of different techniques for presenting numerical functions. In the Air 
Forces, flight personnel must be able to determine quickly and accurately 
a dependent variable (the answer) from a function when they have ob- 
tained the independent variable (the argument) from their flight instru- 
ments. This problem is common to all engineering and science. There 
are generally two techniques available for presenting such functions in 
readily usable form; namely, by tables and graphs. But it is not im- 
mediately evident which type of presentation can be most rapidly and 
accurately used. Not only is there no evidence as to the conditions under 
which tables or graphs are most useful, but there is little evidence as to 
the optimum construction or form of tables or graphs. 

The question of the best method for presenting numerical data has 
been long recognized and many a priori rules have been formulated. In 
1915 a Joint Committee on Standards for Graphic Presentation (3) pub- 
lished a list of suggestions to be followed in presenting statistical and 
quantitative data in graphic form. Many statistical texts and books on 
graphic analysis list rules for designing tables and graphs; for instance, 
Worthing and Geffner (5) in their Treatment of experimental data list a
number of rules. 

There has been some experimental work on graphic presentation. In 
1932 Croxton and Stein (2) investigated the relative accuracy with which 
bars, squares, circles, and cubes could be judged when used in graphic pres-
entation. They found that bars were judged most accurately. Graham
(4) has investigated the influence of such variables as the relative desira- 
bility of presenting bars horizontally or vertically, the influence of spacing 
between the bars, the width of the bar, etc. 

In the study reported here four different problems were investigated. 
First, an attempt was made to determine what influence increasing the 
number of points included in a table (which increases the number of 
pages) would have on the accuracy and speed with which the table could 
be used. Second, the influence of the frequency of coordinate lines or 
rulings in a graph on the accuracy and speed with which the graph could 

* This study was completed while the author was a member of the staff of the Psy-
chology Branch, Aero Medical Laboratory, Wright Field, Dayton, Ohio.

be used was investigated. Third, whether it was faster and more ac-
curate to enter a graph on the x or y-axis was studied. And finally an
effort was made to determine whether, in general, it was better to use the 
best designed table as compared to the best designed graph. 


Procedure 


In a previous experiment (1) it was shown that when the values used 
as arguments are tabulated in a table it is more efficient, in terms of both 
the speed and accuracy with which an individual can determine the re- 
quired information, to present the data in tabular rather than graphical 
form. But when the values used as arguments are non-tabulated values; 
i.e., when the required information must be obtained from the table by 
interpolation, it is more efficient in terms of speed and just as efficient in 
terms of accuracy, to present the material in graphic rather than in tabu-
lar form. Since in one case the table is superior and in the other the graph
is superior, the question is raised as to whether or not by the proper de- 
sign of a graph or table it would be possible to develop a method of presen- 
tation which would be superior for all types of argument. 

To investigate this problem and the others mentioned in the intro-
duction three different tables and four graphs were prepared. All of the
tables and graphs represented the function $Y = X^2/C$, where C = 60, 70,
80, 90 and 100. The tables and graphs were reproduced by a photo-
offset process. Figure 1 shows typical portions of this material, a de-
scription of which follows:


Table V in this experiment was identical with Table IV of the previous 
experiment. This table is the base table to which all other tables and 
graphs in this experiment will be referred. It will be noted that every 
fifth point in the major argument and every tenth point in the minor argu- 
ment were tabulated, and the complete table covered one page. When- 
ever in this experiment the statement is made that a particular argument 
required interpolation to secure the answer, it should be understood to 
mean that the argument would require interpolation if Table V were used. 
As will be seen such an argument might not require interpolation if a more 
extensive table were used. 

Table VI was constructed in the same manner as Table V except that 
every other point in both the major and minor arguments was included in 
the table. This table covered four pages which were stapled together 
in a booklet. In the upper right hand corner of each page was a notation 
giving the range of values of the major and minor arguments covered on 
that page. 

Table VII was constructed in the same manner as Table VI except 
that every whole numbered point for both the major and minor argu-
ments were included. This table covered twelve pages which were sta-
pled together in a booklet.

Graph V in this experiment was identical with Graph IV of the pre- 
vious experiment. This graph was drawn on paper having 20 x 20 
rulings-to-the-inch. (In the process of reproduction this and the follow- 
ing graphs were reduced slightly, one inch in the originals being represented 
by nine-tenths inch in the reproductions.) 

Graph VI was similar to Graph V except that it was drawn on 8 X 8 
rulings-to-the-inch graph paper. 

Graph VII was similar to Graph VI except that it was drawn on 4 X 4 
rulings-to-the-inch graph paper. 

Graph VIII was similar to Graph V except that it was designed for 
entry on the y-axis rather than on the x-axis. 


Answer sheets having problems which required the subjects to deter- 
mine an answer when they were given the major and minor arguments 
were constructed. The subjects were required to solve problems which, 
when referred to Table V, involved tabulated arguments, involved simple 
interpolation for either the major or minor arguments, and which required 
double interpolation. As far as possible the subjects solved the same 
items on each graph or table although the order of the items was always 
changed so that the subjects would not recognize that they were solving 
the same problems several times. The material was given in seven differ- 
ent orders so that each table or graph appeared first for one seventh of 
the subjects, second for another seventh, etc. However, the sequence in 
which the tables and graphs were given was not varied. This sequence 
was: Table V, Graph VII, Graph VIII, Table VII, Graph V, Table VI, 
Graph VI. The subjects were read instructions which described each 
table or graph and explained how it was to be used. If the table required 
interpolation a short description of how to interpolate was given. For 
each table and graph the subjects were allowed to work two minutes on a 
series of problems for which the arguments were tabulated and for three
minutes on a series of problems requiring either single or double inter-
polation. The total working time for all the material was 56 minutes.
The material was taken by sixty-eight male science students at Miami
University.¹ The final data are based on the results of 15 freshmen, 29
sophomores, 11 juniors, 8 seniors and 7 college graduates.²
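The seven orders of presentation can be read as cyclic rotations of the fixed sequence, so that each table or graph occupies each serial position once. The sketch below assumes that reading; the function name is hypothetical.

```python
# The fixed sequence given in the text.
SEQUENCE = ["Table V", "Graph VII", "Graph VIII", "Table VII",
            "Graph V", "Table VI", "Graph VI"]


def rotated_orders(sequence):
    """Return every cyclic rotation of the sequence, one per group of subjects."""
    return [sequence[k:] + sequence[:k] for k in range(len(sequence))]


orders = rotated_orders(SEQUENCE)
for group, order in enumerate(orders, start=1):
    print(group, ", ".join(order))
```

Under this arrangement practice effects are spread evenly over the seven materials, although each material is always preceded by the same neighbor.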

1 The author is greatly indebted to Dr. Clark Crannell for administering the material 
at Miami University. 

2 The data for four of the Miami University subjects were eliminated from this study
since their answers on one or more of the tables or graphs were so radically wrong as to
indicate that they completely misunderstood the task involved. Data from six sub-
jects at the Aero Medical Laboratory replaced the discarded data and increased the
number of cases to seventy.


Results


In the introduction the problems for investigation were enumerated; 
in the following paragraphs the results will be discussed relative to each 
of the four problems. The answer sheets were graded in terms of both 
the number of items completed and the magnitude of the error on each 
item and treated in the same manner as in the previous experiment. 

Before examining the results, mention of the generality of the conclu- 
sions should be made. In the first experiment it was demonstrated that 
the results held for both linear and non-linear data, and also for families 
of data and single curves. There seems to be no reason why our present 
results should not be applicable to both linear and non-linear data, nor 
why the data based on tabulated arguments should not apply to single 
sets of data as well as to the families of data used in this investigation. 
At the same time it should be clearly understood that the conclusions 
apply only to tables and graphs which are being used as a method of 
presenting functional relationships where it is desired to find the answer 
to a given argument. The results should not be extrapolated to graphs 
which are being used to demonstrate the general shape of a function, 
points of inflection, points of maxima and minima, etc. 

Before presenting the major results mention should be made of the
reliability of the material used. A fair measure of reliability is afforded
by the correlation between the different tables or graphs. In a loose
sense these correlations give a test-retest reliability although the true
reliability will be somewhat higher than these correlations since the mate-
rial is not exactly comparable. The correlations between tables for the
number of items attempted range from .28 to .63 with most of the cor-
relations above .50; similarly for the graphs the correlations range from
.42 to .78. When it is remembered that these correlations are based on
only two or three minutes working time it would seem that the materials 
are fairly reliable as far as numbers of items completed is concerned. On 
the other hand the correlation between the magnitude of errors from table 
to table range from —.06 to .03, and for graphs from .00 to .28. As 
long as we are interested in comparing only mean scores, two distributions 
may be based on fairly unreliable measures but still differ very signifi- 
cantly, e.g., when the intrinsic magnitude of the measures are different. 

1. The first problem investigated was a determination of the influ- 
ence of increasing the number of tabulated points in a table on the speed 
and accuracy with which a table could be used. It will be remembered 
that Table V had every fifth point of the major argument and every 
tenth point of the minor argument tabulated; Table VI had every other 
point on both arguments tabulated; and Table VII had every point of 
both arguments tabulated. The tables were increased from one, to four,


Table 1

Means, Standard Deviations, Correlation, and Critical Ratios Between Tables in
Terms of the Number of Problems Completed
N = 70

                             Mean Number        Standard
                             Completed          Deviation
                             First    Second    First    Second    Corre-    Critical
Tables                       Table    Table     Table    Table     lation    Ratio

Tabulated Values
  Table V and Table VI       19.89    10.07     4.52     2.93      .53       21.35
  Table VI and Table VII     10.07     7.63     2.93     2.23      .51        7.87

Single Interpolation
  Table V and Table VI        6.69    11.00     2.65     3.58      .59       12.31
  Table VI and Table VII     11.00    13.68     3.58     2.80      .63        7.88

Double Interpolation
  Table V and Table VI        1.99     4.51     1.20     1.71      .28       12.00
  Table VI and Table VII      4.51    14.13     1.71     2.82      .33       29.15











to twelve pages, respectively. Table 1 shows the mean, standard devia-
tion, correlation, and critical ratios for the different tables. It will be
observed that as the size of a table increases the number of items com-
pleted decreases rapidly when tabulated arguments are used. On the
other hand as the size of a table increases to more and more pages the
number of items completed increases if interpolation is required. Table
2 presents the corresponding data for the error scores. From Table 2


Table 2

Means, Standard Deviations, Correlation and Critical Ratios Between Different
Tables in Terms of the Magnitude of Errors
N = 70

                             Average Subjects’  Standard
                             Error per Problem  Deviation
                             First    Second    First    Second    Corre-    Critical
Tables                       Table    Table     Table    Table     lation    Ratio

Tabulated Values
  Table V and Table VI        .34      .56       .69     2.18      −.02       .81
  Table VI and Table VII      .56      .81      2.18     3.53      −.06       .49

Single Interpolation
  Table V and Table VI       1.52      .52      2.37     1.31       .01      3.13
  Table VI and Table VII      .52      .36      1.31     1.17       .05       .76

Double Interpolation
  Table V and Table VI       3.18     1.07      3.85     3.55       .03
  Table VI and Table VII     1.07      .31      3.55      .71

it will be seen that the average magnitude of errors made when entering
the different tables with tabulated values is about the same from table to 
table. However, the magnitude of the errors made when entering the 
different tables with arguments requiring interpolation decreases as 
the number of points tabulated in the table increases. 
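The critical ratios in Table 1 can be approximated from the tabled means, standard deviations, and correlations. The paper does not state the formula used, so the sketch below assumes the conventional critical ratio for the difference between two correlated means, with the standard error of each mean taken as the standard deviation divided by the square root of N. The function name is hypothetical.

```python
import math


def critical_ratio(mean1, mean2, sd1, sd2, r, n):
    """Difference between correlated means divided by its standard error.

    Assumed formula: SE_diff = sqrt(SE1^2 + SE2^2 - 2 * r * SE1 * SE2),
    where SE = SD / sqrt(N).
    """
    se1 = sd1 / math.sqrt(n)
    se2 = sd2 / math.sqrt(n)
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)
    return abs(mean1 - mean2) / se_diff


# Tabulated values from Table 1 (number of problems completed, N = 70).
print(critical_ratio(19.89, 10.07, 4.52, 2.93, 0.53, 70))  # Table V and Table VI
print(critical_ratio(10.07, 7.63, 2.93, 2.23, 0.51, 70))   # Table VI and Table VII
```

Under these assumptions the computed values fall within a few tenths of the published ratios of 21.35 and 7.87; small discrepancies are to be expected because the tabled inputs are rounded.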

Which table is best? Such a question cannot be answered categorically
but must be answered in terms of the use to be made of each specific
table. If a table must be used rapidly and rather large errors can be
tolerated, then it is best to use a very simple table such as Table V and
enter it with the nearest tabulated value. On the other hand, if speed
is not of extreme importance but accuracy is essential, then a table
containing every point will give the best results. In general it seems that a
table containing every point is to be preferred since, if a wide range of
arguments is used, it can be used as rapidly as, and more accurately than,
simpler tables.
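
The trade-off described above can be illustrated with a short sketch. It is
not part of the original study: the relationship tabulated there is not
reproduced in this section, so y = x² stands in for it, and the names
`make_table`, `nearest_lookup` and `interpolate` are illustrative
assumptions. The sketch compares entering a simple table with the nearest
tabulated argument, interpolating linearly in the same table, and reading a
table in which every point is given.

```python
from bisect import bisect_left


def make_table(func, start, stop, step):
    """Tabulate func at start, start + step, ..., stop (integer arguments)."""
    xs = list(range(start, stop + 1, step))
    return xs, [func(x) for x in xs]


def nearest_lookup(xs, ys, x):
    """Enter the table with the tabulated argument closest to x."""
    i = min(range(len(xs)), key=lambda k: abs(xs[k] - x))
    return ys[i]


def interpolate(xs, ys, x):
    """Single (linear) interpolation between the two bracketing entries."""
    i = bisect_left(xs, x)
    if i < len(xs) and xs[i] == x:
        return ys[i]  # the argument is itself tabulated
    if i == 0 or i == len(xs):
        raise ValueError("argument lies outside the table")
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def square(x):
    # Stand-in relationship (an assumption); the study's equation is not shown here.
    return x * x


coarse_xs, coarse_ys = make_table(square, 0, 100, 10)  # a very simple table
full_xs, full_ys = make_table(square, 0, 100, 1)       # every point given

argument = 37
exact = square(argument)
nearest = nearest_lookup(coarse_xs, coarse_ys, argument)
interp = interpolate(coarse_xs, coarse_ys, argument)
full = nearest_lookup(full_xs, full_ys, argument)

print("nearest tabulated value:", nearest, "error:", abs(nearest - exact))
print("single interpolation:   ", interp, "error:", abs(interp - exact))
print("every-point table:      ", full, "error:", abs(full - exact))
```

In this toy case the nearest-value entry is the quickest to carry out but
has the largest error, interpolation reduces the error at the cost of
arithmetic, and the complete table needs no arithmetic at all.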

2. The second aspect of this part of the investigation was to determine
the influence of varying the distance between coordinate rulings on the
speed and accuracy with which graphs could be used. Graph V was
drawn on 20 × 20 line-to-the-inch graph paper, Graph VI on 8 × 8
line-to-the-inch paper and Graph VII on 4 × 4 line-to-the-inch paper.
Table 3 shows the means, standard deviations, correlations and
critical ratios for the number of problems completed on each graph. An
examination of Table 3 shows that there are not marked differences in
the number of problems completed from graph to graph although some
of the differences are significant. Only in the problems which are
tabulated, and correspondingly fall on one of the rulings, is there any marked
increase in the number of items completed. Table 4 shows the means,
standard deviations, correlations and critical ratios for the error data.
Again it will be seen that there is little difference in the magnitude of
error found in using the different graphs. This is somewhat surprising
since it has been supposed that increasing the number of rulings would
increase the accuracy with which a graph can be used. While the critical
ratios indicate that there are some significant differences in the speed and
accuracy with which the different graphs can be used, these differences
do not appear to be systematic enough, nor are they large enough, to
warrant any strong recommendations regarding the best frequency of
coordinate rulings. At least there seems to be no reason to suppose that
very frequent coordinate rulings improve the speed and accuracy with
which a graph can be used.


Table 3

Means, Standard Deviations, Correlations and Critical Ratios Between Different
Graphs in Terms of the Number of Problems Completed
N = 70

                                  Mean Number           Standard
                                    Completed          Deviation
                                First   Second    First   Second   Corre- Critical
Graphs                          Graph    Graph    Graph    Graph   lation    Ratio

Tabulated Values
  Graph V and Graph VI           9.00     9.04     2.86     2.53      .47      .12
  Graph V and Graph VII          9.00    11.00     2.86     3.27      .42     5.00
  Graph VI and Graph VII         9.04    11.00     2.53     3.27      .51     5.60

Single Interpolation
  Graph V and Graph VI          12.69    11.27     4.28     3.42      .62     3.46
  Graph V and Graph VII         12.69    13.43     4.28     4.43      .69     1.57
  Graph VI and Graph VII        11.27    13.43     3.42     4.43      .65     5.27

Double Interpolation
  Graph V and Graph VI          11.14    10.97     3.26     3.67      .76      .59
  Graph V and Graph VII         11.14    11.43     3.26     3.78      .69      .85
  Graph VI and Graph VII        10.97    11.43     3.67     3.74      .78     1.53


Table 4

Means, Standard Deviations, Correlations and Critical Ratios Between Different
Graphs in Terms of the Magnitude of Errors
N = 70

                             Average Subjects’          Standard
                             Error per Problem         Deviation
                                First   Second    First   Second   Corre- Critical
Graphs                          Graph    Graph    Graph    Graph   lation    Ratio

Tabulated Values
  Graph V and Graph VI           3.30     2.69     5.03     4.92      .09      .76
  Graph V and Graph VII          3.30     2.50     5.03     3.81      .14     1.14
  Graph VI and Graph VII         2.69     2.50     4.92     3.81      .03      .26

Single Interpolation
  Graph V and Graph VI           2.38     2.65     1.76     2.81      .14      .69
  Graph V and Graph VII          2.38     2.94     1.76     2.85      .17     1.51
  Graph VI and Graph VII         2.65     2.94     2.81     2.85      .08      .63

Double Interpolation
  Graph V and Graph VI           3.64     2.61     3.71     1.49      .10     2.24
  Graph V and Graph VII          3.64     3.43     3.71     2.00      .17      .46
  Graph VI and Graph VII         2.61     3.43     1.49     2.00      .28     3.18
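
The tables report critical ratios without stating how they were computed. A
minimal sketch follows, assuming the conventional critical ratio for the
difference between two correlated means (the same 70 subjects used every
graph): the difference between the means divided by its standard error,
where the standard error of the difference is
sqrt(SE1² + SE2² - 2·r·SE1·SE2) and each SE is SD/sqrt(N). The function name
`critical_ratio` and the use of N rather than N - 1 are assumptions, not
details taken from the article. The inputs are the tabulated-values row for
Graph V and Graph VII in Table 3, reading the correlation as .42.

```python
import math


def critical_ratio(mean1, mean2, sd1, sd2, r, n):
    """Critical ratio for the difference between two correlated means.

    Assumed formula: |M1 - M2| / sqrt(SE1^2 + SE2^2 - 2 r SE1 SE2),
    with SE = SD / sqrt(N).
    """
    se1 = sd1 / math.sqrt(n)
    se2 = sd2 / math.sqrt(n)
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)
    return abs(mean1 - mean2) / se_diff


# Graph V and Graph VII, tabulated values, number of problems completed.
cr = critical_ratio(9.00, 11.00, 2.86, 3.27, 0.42, 70)
print(round(cr, 2))
```

Under these assumptions the result falls close to the reported critical
ratio of 5.00; small discrepancies are to be expected because the published
means, standard deviations and correlations are rounded.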

3. The next problem studied involved the relative speed and accuracy
with which two similar graphs could be used when one was constructed to
be entered on the x-axis and the other to be entered on the y-axis. It is
conventional in statistics to construct graphs with the independent
variable plotted on the x-axis and the dependent variable on the y-axis.
This usually requires that in using the graph, the eyes start at the bottom
of the graph, sweep up to the curve and then across to the y-axis. These
eye movements are contrary to our usual habits where we read from the
upper left to the right and down. It was therefore hypothesized that the
graph entered on the y-axis might be used more rapidly than one entered
on the x-axis. Table 5 shows the pertinent comparisons. An examination
of Table 5 shows that there is little difference in the accuracy or
speed with which the two graphs were used, although in the comparison
where single interpolation was required the conventional graph is somewhat
better; it was used more rapidly and accurately than the graph entered
on the y-axis. Of course past training may influence the results.


Table 5

Means, Standard Deviations, Correlation and Critical Ratios Between a Graph
Entered on the x-axis and One Entered on the y-axis
N = 70

                                                        Standard
                                         Mean          Deviation
                                Graph    Graph    Graph    Graph   Corre- Critical
                                    V     VIII        V     VIII   lation    Ratio

Number Completed
  Tabulated Values               9.00     8.66     2.86     2.23      .54     1.13
  Single Interpolation          12.69    11.23     4.28     3.38      .64     3.65
  Double Interpolation          11.14    10.66     3.26     3.63      .61     1.20

Magnitude of Errors
  Tabulated Values               3.30     3.51     5.03     5.09      .00      .24
  Single Interpolation           2.38     3.35     1.76     3.69      .06     2.02
  Double Interpolation           3.64     3.04     3.71     3.04      .12     1.09

4. The final problem studied was whether or not it is generally better
to use the “best” designed table or the “best” designed graph. As has
been pointed out the answer to this question depends on the use to be
made of the table or graph. If we wish to know the medium of presentation
which allows the most rapid and accurate use for all types of arguments
we can compare Table VII and Graph VII. It was argued on
page 646 that Table VII was the best over-all table and on page 647 that
Graph VII is as good as, and perhaps better than, any other graph. Table
6 shows the means, standard deviations, correlations and critical ratios
between Table VII and Graph VII. Inspection of Table 6 reveals
that in one instance the graph is used significantly faster than the table,
in one instance there is no apparent difference and in one instance the
table is used significantly faster than the graph. However, in every case the
table is used more accurately than the graph. It appears that, within
the limits of the material investigated, a table in which every point is
given is as rapid as, and more accurate to use than, any other method of
presenting the data.


Table 6

Means, Standard Deviations, Correlations and Critical Ratios Between the
“Best” Table and the “Best” Graph
N = 70

                                                        Standard
                                         Mean          Deviation
                                Table    Graph    Table    Graph   Corre- Critical
                                  VII      VII      VII      VII   lation    Ratio

Number Completed
  Tabulated Values               7.63    11.00     2.23     3.27             10.21
  Single Interpolation          13.68    13.43     2.80     4.43               .56
  Double Interpolation          14.13    11.43     2.82     3.74              7.30

Magnitude of Errors
  Tabulated Values                .81     2.50     3.53     3.81              2.68
  Single Interpolation            .36     2.94     1.17     2.85              6.97
  Double Interpolation            .31     3.43      .71     2.00      .30    12.48


Summary and Conclusions 


The relative effectiveness of using tables and graphs for presenting 
functional relationships was investigated by having subjects use tables 
and graphs which presented the same equation. On the basis of these 
subjects’ results and those of a previous experiment it is concluded that: 


1. The speed and accuracy with which tables can be used vary con- 
siderably with the type of construction of the table. In general a table 
containing every point is to be preferred since for most problems it can 
be used as rapidly and more accurately than simpler tables. 

2. The differences between results with different graphs are not
systematic enough, nor are they large enough, to indicate that the frequency
of coordinate rulings is important in determining the speed and
accuracy with which a graph can be used.

3. Except for convention, it makes no difference whether a graph is
entered on the x-axis or the y-axis.

4. Within the limits of the material investigated, a table in which
every point is given is as rapid as, and more accurate to use than, any other
method of presenting data.

5. Whenever possible tables should be reduced to the very simplest
form and entered with the nearest tabulated arguments. If such errors 
as would arise from this procedure cannot be tolerated, then a very com- 
plete table should be used. 

6. Both AAF pilots and college students are very slow at, and make 
large errors in, interpolation. 

7. Graphs should not be used as a technique for presenting data 
unless an interpretation of the shape of the curve presented is important; 
or unless the speed and accuracy with which the graph is used are re- 
latively unimportant. 


Received December 14, 1946.


Relation of Personality and Character Requirements
to Jobs in a Civil Service Agency 


Robert F. Utter 
University of California at Los Angeles 


The question “What are the personality and character traits that are
important in job success?” is a legitimate one for the student of industrial
psychology. One way to obtain information about the matter is to study 
existing records of what large organizations are doing at the present time 
in attempting to ascertain such qualities in prospective employees. Such 
study may reveal the rationale which selects traits considered important. 
It may be that the principal service the psychologist can offer to industry 
is the pointing out of inadequacies in existing systems for the selection of
employees. One difficulty in doing such a study is that judgments of
an applicant’s “character” and “personality” are generally made by an
interviewer whose biases are likely to go unrecorded, so that the selection
procedures are invalidated. The published job announcements of a large civil
service agency in the Los Angeles area present the data necessary to 
conduct such a study. 

The problem in the study was to abstract the information concerning 
personality and character requirements for a large number of jobs from 
the announcements of position vacancies. A partially complete file of 
these job announcements was available in the Psychology Department 
for the years of 1941 through 1943. A total of 425 announcements was 
studied although this does not represent that number of separate types 
of jobs. An attempt was made to secure complete files for the period 
covered but the agency does not maintain dead files of old job announce- 
ments. The agency acts as a personnel procurement agency for a num- 
ber of unincorporated municipalities. Since job announcements for 
positions in these areas are prepared by the agency they are included 
without segregation in the lists. All job announcements were listed 
serially in order of their publication. In a few cases an announcement
superseded a previous one, but since only the date of eligibility was
involved the data from the original announcements were tallied.

The personality and character requirements for each job are listed 
in a paragraph of the announcement titled General Requirements. As 
an example, this paragraph for the job of Assistant Justice Court Clerk
(January 13, 1942) reads: “Candidates must have good clerical ability;
write a clear, legible hand; be able to typewrite with reasonable speed and 
accuracy; have a knowledge of civil and criminal procedure in Justice 
Courts; be thoroughly familiar with legal terminology; have a knowledge 
of legal office practices and procedures and of legal documents, forms and 
reports commonly used; be accurate in handling monies; have the ability 
to keep accurate records and files; be diplomatic, reliable and trustworthy; 
and have the personality and ability to meet the public effectively and to
cooperate with other employees of the court.” The italicized phrases are those
which were tallied. 

It was necessary to arrange the data so that tabulation and sum- 
marization were possible. This involved two steps. 


1. Jobs were sorted into categories of unskilled, semi-skilled, skilled, 
sub-professional and professional. 

2. A list of the requirements was prepared so that for each job the 
appropriate ones could be tallied by checking a column. 


Since the results are in terms of these procedures they are herewith pre- 
sented in some detail. 

The classification of jobs was based on the training and experience 
required to fill the job. The unskilled class required usually a maximum 
of a high school education and no technical training or experience— 
examples: day laborer, apprentice fireman, apprentice patrolman. The 
semi-skilled class is a broad one requiring some training and basic skills 
such as ability to operate a typewriter, perform mechanical drawing or to
handle the tools of an apprentice tradesman; examples: bookkeeper,
dictating machine operator, street painter, sewer pipe layer, etc. The
skilled class is restricted to jobs requiring about three years experience of 
the “journeyman” type, highly skilled office workers and technicians of 
various sorts not required to possess professional training. The sub- 
professional class includes jobs requiring some professional training— 
examples: junior engineering jobs, nursing, social work, interne jobs, etc. 
The professional class includes lawyers, doctors, engineers, architects, 
administrators and other persons in recognized professions. In the final 
tabulation the distinction between the sub-professional and professional 
classes is not maintained so that four classes of jobs are represented. 

The requirements selected to be tallied were obtained by checking 
through about one hundred job announcements. There appeared to be 
a limited number of descriptive phrases used throughout so the following 
list of seventeen was chosen: 

1. Work Effectively with Other Employees; 2. Have Integrity; 3. Display
Initiative; 4. Have Resourcefulness; 5. Have Excellent Character; 6. Be
Responsible; 7. Be Reliable and/or Trustworthy; 8. Be Thorough; 9. Be
Able to Assume Responsibility; 10. Display Good Judgment; 11. Be
Tactful; 12. Be Diplomatic; 13. Be Firm; 14. Be Able to Meet the Public;
15. Be Able to Think Clearly; 16. Be Energetic; and 17. Be Courteous.

In the announcements the above phrases appeared literally or with 
slight variation. The principal exception occurred in connection with 
item 10 where the words “sound judgment” and “sound independent 
judgment” were sometimes used. These were nevertheless included under
item 10. On a very limited number of announcements a few other
requirements were mentioned and although these were tallied they are not
carried over into the final summary.
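
The tallying procedure described above can be sketched in a few lines. This
is an illustration only: the four sample announcements and the names
`VARIANTS`, `tally` and `percentages` are hypothetical and do not come from
the agency's files. Only the folding of the "sound judgment" wordings into
item 10 follows the text.

```python
from collections import Counter, defaultdict

# Variant wordings folded into a listed requirement (item 10 in the text).
VARIANTS = {
    "sound judgment": "Display Good Judgment",
    "sound independent judgment": "Display Good Judgment",
}

# Hypothetical announcements: (job class, phrases under General Requirements).
announcements = [
    ("unskilled", ["Be Energetic", "Be Reliable and/or Trustworthy"]),
    ("unskilled", ["Be Energetic"]),
    ("professional", ["sound judgment", "Have Integrity"]),
    ("professional", ["sound independent judgment"]),
]


def tally(records):
    """Count, per job class, how many announcements list each requirement."""
    jobs_per_class = Counter()
    counts = defaultdict(Counter)
    for job_class, phrases in records:
        jobs_per_class[job_class] += 1
        # A set ensures each requirement is counted once per announcement.
        for requirement in {VARIANTS.get(p, p) for p in phrases}:
            counts[job_class][requirement] += 1
    return jobs_per_class, counts


def percentages(jobs_per_class, counts):
    """Convert the tallies to the percentage of jobs in each class."""
    return {
        job_class: {
            req: round(100.0 * n / jobs_per_class[job_class], 1)
            for req, n in reqs.items()
        }
        for job_class, reqs in counts.items()
    }


jobs, counts = tally(announcements)
table = percentages(jobs, counts)
print(table)
```

Dividing each tally by the number of announcements in its own job class is
what makes classes of different size comparable.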

Table 1 gives a summary of the percentage of jobs by occupational 
class which called for each particular requirement. The figures are for
the three-year period.


Table 1

Percentage of Jobs by Level in a Large Civil Service Agency Requiring
Particular Personality Characteristics

                                            Un-      Semi-               Profes-
                                        skilled    skilled    Skilled     sional
Personality Requirement                (N = 78)  (N = 165)   (N = 69)  (N = 113)

 1. Work Effectively with Others           92.3       95.2       95.7       98.3
 2. Have Integrity                         33.9       53.9       65.3       76.1
 3. Display Initiative                     22.6       10.9        4.5        2.7
 4. Have Resourcefulness                   31.0       47.5       46.4       72.6
 5. Have Excellent Character               11.5        1.8        4.5        1.0
 6. Be Responsible                         44.8       10.9        4.5        6.2
 7. Be Reliable and/or Trustworthy         80.8       76.4       68.1       55.7
 8. Be Thorough                            19.2       23.0       30.4       16.8
 9. Able to Assume Responsibility           1.3       18.2       33.4       57.5
10. Display Good Judgment                  41.0       34.5       58.0       70.8
11. Be Tactful                             20.6        5.5        4.5       10.6
12. Be Diplomatic                          20.6        6.1        4.5        7.9
13. Be Firm                                20.6        3.0        4.5        7.1
14. Able to Meet the Public                41.0       35.1       30.4       56.6
15. Think Clearly                          39.8        2.4       10.1        1.0
16. Be Energetic                           48.7       24.8       20.3        6.2
17. Be Courteous                           14.1        9.7        1.5       27.4


From rational considerations one would expect that a given desirable
requirement would be basic to success in a larger proportion of
professional-type jobs than of unskilled jobs. A reversal of this expectancy
occurs with: 3 (Display Initiative), 5 (Have Excellent Character), 6 (Be
Responsible), 7 (Be Reliable and/or Trustworthy), 11 (Be Tactful), 12
(Be Diplomatic), 15 (Think Clearly), 16 (Be Energetic). In each case
the percentage of unskilled jobs listing the requirement is higher than the
percentage in other classes of jobs. Certain of the requirements conform
to expectancy, notably: 2 (Have Integrity), 4 (Have Resourcefulness),
9 (Able to Assume Responsibility), 10 (Display Good Judgment) and
14 (Able to Meet the Public). Those requirements showing no clear-cut
conformance to expectancy are: 1 (Work Effectively with Others),
8 (Be Thorough), 13 (Be Firm) and 17 (Be Courteous).
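
The grouping above can be checked mechanically against Table 1. The sketch
below transcribes the percentages for the eight requirements listed as
reversals and tests the stated condition, namely that the unskilled
percentage exceeds that of every other class. The dictionary and function
names are illustrative assumptions.

```python
# Percentages from Table 1, in the order:
# unskilled, semi-skilled, skilled, professional.
REVERSALS = {
    3: ("Display Initiative", (22.6, 10.9, 4.5, 2.7)),
    5: ("Have Excellent Character", (11.5, 1.8, 4.5, 1.0)),
    6: ("Be Responsible", (44.8, 10.9, 4.5, 6.2)),
    7: ("Be Reliable and/or Trustworthy", (80.8, 76.4, 68.1, 55.7)),
    11: ("Be Tactful", (20.6, 5.5, 4.5, 10.6)),
    12: ("Be Diplomatic", (20.6, 6.1, 4.5, 7.9)),
    15: ("Think Clearly", (39.8, 2.4, 10.1, 1.0)),
    16: ("Be Energetic", (48.7, 24.8, 20.3, 6.2)),
}


def unskilled_is_highest(row):
    """True when the unskilled percentage exceeds every other class."""
    unskilled, *others = row
    return all(unskilled > value for value in others)


checks = {
    item: unskilled_is_highest(values)
    for item, (_, values) in REVERSALS.items()
}

for item, ok in checks.items():
    print(item, REVERSALS[item][0], "reversal holds" if ok else "does not hold")
```

The condition holds for all eight items as transcribed, which agrees with
the statement in the text.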

The language of the job announcements was adhered to but some 
of the requirements are related or overlapping. The related require- 
ments are not consistently listed for various jobs. The best example is 
given by: 14 (Able to Meet the Public) and 17 (Be Courteous). In each 
job class the percentage of jobs requiring 17 is smaller than that for 14. 

This brief summary pointedly indicates that for the agency involved
the random relation between “personality” and “character” requirements
and job class prevents the use of these listed requirements as
adequate job selection criteria.


Received January 27, 1947. 





Book Reviews 


Smyth, Richard C. and Murphy, Matthew J. Job evaluation and employee
rating. New York: McGraw-Hill, 1946. Pp. 255. $2.75.


The authors of this book have produced a very readable work which 
devotes the first 167 pages in Part I to a discussion and analysis of job 
evaluation and the balance of 72 pages in Part II to employee rating or 
merit rating. 

The first chapters in Part I introduce the reader rather briefly to the
role that job evaluation should play in the management field, and explain
the four principal methods of job evaluation: the ranking method, the
grading method, the factor comparison method, and the point method.
Chapter 5, in which the authors discuss the comparative merits and
demerits of the four methods, is worthy of special commendation.
Proponents of the methods other than the point method, for which the
authors express their preference, could not claim that any of the methods
are deliberately “sold short” in this discussion. The tone of the
discussion is impartial and shows insight into the ways in which the various
methods are likely to influence the rating process. The pitfall of too
lengthy discussion of hypothetical details is characteristically avoided.

The balance of Part I hits the “high spots” in the installation and
administration of a job evaluation plan, including discussions on job
description, employee classification, the labor market wage survey,
determining the wage scale, and basic wage administration policies. The
important bearing which the competitive labor market area rates have
in the setting of base wages is emphasized by the authors: “. . . it is
evident that both industry and area rates must be given due consideration.
A labor-market wage survey is the only really satisfactory method of
securing the facts that are necessary for assessing the adequacy of an
existing or contemplated wage scale.” This chapter continues with a
practical discussion of ways by which the appraisal of labor market wage rates
and their relation to any proposed job evaluation plan may be
accomplished.

Part II on merit rating is developed in very much the same way as
Part I, and starts out by introducing the reader to the purpose and
benefits of merit rating, followed by an explanation of the principal methods
of merit rating: ranking, man-to-man comparisons, check lists, continuous
scales and discontinuous scales. Next, the authors include a particularly
good chapter in which some of the statistical and psychological
problems of judgment and rating are ably discussed. In commenting on
the form which the frequency distribution of ratings may take, several
factors are explained which would logically be expected to result in a
skewed rating distribution, and the authors assert, “it is impossible to
generalize that the plotted distribution of any given set of ratings must
or should take any specific shape.”

The last chapters include brief but lucid expositions of characteristic
problems which may be encountered in establishing and administering
a merit rating plan.

In presenting this book, the authors have avoided making it an
exhaustive manual of procedure. Neither have they burdened the reader
with preponderant formal or theoretical justifications. Rather, they
have accomplished a readable and informative discussion of some of the
principal or characteristic problems involved in the use of these two
management techniques which will appeal to readers desiring to supplement
their information on job evaluation and merit rating practice.


C. H. Lawshe, Jr. 
Purdue University 


Gillett, Albert N. How to evaluate supervisory jobs. Deep River, Con- 
necticut: National Foremen’s Institute, 1945. Pp. 34 + forms. 
$7.50. 


This publication is frankly a “How To” manual which is merely de- 
scriptive of the “final result . . . not the author’s historical background 
of the development of an idea or numerous arguments substantiating his 
findings and conclusions.” Its purpose is “first, to measure in relative 
terms the requirements and demands of supervisory and executive posi- 
tions; second, to measure in the same relative terms the individual oc- 
cupying . . . this position.” Part I describes the job evaluation pro- 
cedures, Part II describes how the individual is rated, and Part III 
presents a working kit of blank forms. The author’s stated intention
is to provide the rating specialist with “a complete text on the subject.”

The Job Evaluation program proceeds in the usual manner. It is
administered by an in-plant committee: detailed job descriptions are prepared,
point system ratings based on twenty-one characteristics are made,
inter-job comparisons in terms of each characteristic and total points are
studied before final approval, tables converting point ratings to money
values are constructed, and position grades and appropriate salary ranges
are set up.

The Supervisory Job Performance Appraisal is designed to rate the 
individual as a basis for making merit adjustments in salary. The rating 
form yields total point ratings on five general aspects of performance,
each based on twelve characteristics. The point range for each of these
aspects is similar to that of the Job Evaluation point system and it is 
assumed that a direct comparison between ratings of the individual and 
ratings of the job is thereby possible. 

This publication cannot be reviewed in the ordinary way, since the 
author presents no rationale of his system and no evidence in its support. 
The approach is essentially one of “Here’s how to rate a job and its
occupant ... period.” It is worth presenting in this journal, however,
because it provides both a warning and a challenge.

It can serve as a warning to companies considering a job evaluation 
program. A recent survey by the Industrial Relations Section of Prince- 
ton University revealed that in sixty-four companies nearly one-third of 
the job evaluation plans were unsatisfactory in the long run, chiefly be- 
cause of inflexibility, lack of provision for continuing administration, and 
disregard for factors other than job content. Job evaluation manuals 
such as the one under review will tend to aggravate rather than improve 
this condition since they give the impression that job rating is a relatively 
simple, mechanically applied procedure requiring merely a detailed guide 
and meticulous analysts. They lead to resolutions such as the one passed 
in April, 1946 by the Utility Workers Union of America condemning job 
evaluation plans as “pseudo-scientific arrangements for circumventing
collective bargaining.” They fail to help management see its wage ad- 
ministration problems in their total setting. 

Of greater interest is the challenge which this publication makes to 
industrial psychologists. For the proposed rating procedures show little 
impact of psychological principles and research findings. Psychologists
presumably have either failed to make valuable contributions to job
evaluation or neglected to “spread the gospel” to those actively
engaged in the work.

There is thus a wide-open field for significant research by psychologists 
in this area. The basic problems stated by Viteles in 1941 in “A Psychologist
Looks at Job Evaluation” are still in need of critical study. Experimental
determination of the basic factors to be rated, of their appropriate
weighting, and of the reliability of the ratings should be made. The 
relative accuracy and comparability of the methods in use—ranking, 
classification, factor-comparison, point rating—need experimental study. 
The validity of an atomistic approach which disregards the dynamics of 
job requirements can be questioned. The limitations of a “still-picture”
analysis which does not consider trends as well as the momentary status 
of the job need demonstration. 

There is actually little experimental evidence available in this field. 
This journal can point with pride to a recent series of articles by Lawshe 
and associates and one by Rogers which study the problem of identifying
and weighting the basic factors. But most of the questions raised above
are still unanswered or need answers based on psychological research. 


Therein lies the challenge. 
Albert S. Thompson 
Vanderbilt University 


Oakley, C. A. Men at work. London: University of London Press, 1946.
Pp. xii + 301. 8/6d.

In order to understand this book it is necessary to know something
of the author and of the setting in which it was written. Oakley is a 
director of the British National Institute of Industrial Psychology, a 
lecturer in industrial psychology at Glasgow University, and a member 
of top management of several companies. During the war he occupied 
a high administrative position in the Ministry of Aircraft Production. 

The important thing to note is that unlike most persons who are con- 
cerned with the application of psychology in industry, Oakley is not 
operating as a technical specialist but rather as an administrator and a 
maker of policy. This is a situation in which few psychologists have 
found themselves, and it is quite apparent that the picture will look
somewhat different. Specific experimental findings, exact statistical
relationships, and details of methodology become of lesser concern than 
broad generalizations, large issues, and especially implementation of 
plans. It is not that the findings and techniques of scientific industrial 
psychology are thought to be of little importance, but rather their applica- 
tion depends upon a favorable attitude and acceptance on the part of top 
management. It is the apparent purpose of this book to sell top manage- 
ment on the usefulness of the psychologist’s efforts and activities, to show 
the breadth of their scope, and in larger terms the benefits to be derived 
therefrom. 

One other matter concerning the setting of the book warrants mention.
This book is a British industrial psychologist’s contribution to the current
major problem of social and industrial restoration of the Empire. With
Britain in a post-war economic crisis, high levels of production are essential
for a sound future. But in line with Britain’s social philosophy of today
the author takes the point of view that production should not be achieved 
at the expense either of workers’ social gains or of social objectives. In- 
deed, the position is taken that production and social gains can and must 
go hand in hand and that the former cannot be achieved in any stable 
manner without accomplishment of the latter. 

Thus the book is definitely one with a message: to sell top management
on the usefulness of psychology as an important means for achieving high
production, a means commensurate with current social thinking. As is
likely to occur with books containing a special plea there are a number of
inexact statements, incomplete evaluations, and acceptance of doubtful
theories. The critical reader is not likely to view favorably the statement
that Spearman’s two-factor theory is “the commonly accepted
theory of intelligence.” He will question the proposition that “whether
or not a man has a major or minor accident is largely a matter of luck.”
He will wonder at the author’s gullibility with respect to benefits to be
achieved from high levels of illumination. The statistic, a 400% decrease in
absences, will arouse his curiosity.

The reader will be unhappy not only with statements of the sort just
described but also with the particular emphasis given to the various
topics. The first two chapters, entitled “Easier to Live: Harder to Live”
and “Managers and Managed,” comprise one-fifth of the book and will be
the most interesting. It is in these chapters that the author’s professional,
industrial, and social points of view are presented. The rest of
the book is a rather superficial review of the field of industrial psychology.
Chapter III, “Choosing Young Workpeople,” is a simple survey of
employment tests with a few results. Chapter IV, “Training for Work,”
is chiefly the author’s views on the British educational system. Chapter
V, entitled “Tiredness,” has very little to do with tiredness but covers
a variety of topics from absenteeism to incentives. In Chapter VI, “The
Physical Background of Work,” a sketchy review is given of lighting,
ventilation, and noise. Chapter VII, “Not Having Accidents,” is a
most cursory treatment of the topic and largely ignores the many excellent
and pertinent investigations of the Industrial Health Research Board.
Chapter VIII, “Effective Work in Place of Hard Work,” contains a
fairly interesting, though uncritical, analysis of time and motion study.
The title of the final chapter, “The Human Factor in Industry,” is a
misnomer since it is simply a short review of the book.

As a survey of the field of industrial psychology, this book will be
found to be of little value either as a reference or as a text. It is certainly
not in line with the great tradition of British industrial psychology. At
best, as indicated above, it presents a somewhat different viewpoint.


Edwin E. Ghiselli 
University of California 


Klein, Paul E. and Moffitt, Ruth E. Counseling techniques in adult
education. New York: McGraw-Hill Book Co., Inc., 1946. Pp. xi +
185. $2.00.


Here is an excellent outline of the organization, promotion and con- 
duct of a personnel program in a public adult evening school system. 
The authors have drawn upon their experience as director and counselor, 
respectively, in adult schools of San Diego, California, with corroborating 
quotations and citations from pertinent authoritative literature in the
field of guidance. The book’s chief appeal will be to administrators and
counselors in adult education for whom it gleans proven principles and 
practices, more extensively developed by other authors for day school 
personnel workers. Its applicability to social agencies, churches, and 
veterans’ organizations is merely suggested. 

The title leads the reader to expect a full discussion of counseling 
techniques but half of the book emphasizes the machinery of guidance 
in an adult evening school even after acknowledging that “counseling is 
only one phase of guidance” (p. 2). For instance, the scope of the 
counseling program is made to include: student enrollment and atten- 
dance, student data and records, credit and special students, student- 
body activities and the curriculum, in addition to the obvious counseling 
techniques of student interviews, personal and vocational counseling and 
psychotherapy (pp. 5-7). The authors have stretched the term 
“counseling” as Brewer once did “education” (Brewer, J. M. Education as 
Guidance, Macmillan, 1932). 

The chapter on educational counseling is well handled in 31 pages, as 
contrasted with a skimpy discussion of occupational counseling (10 pp.), 
especially when it is recognized that a majority of evening students 
come to school to increase their chances for occupational advancement. 

The professional counselor of adults will welcome this book as evidence 
of an awakening to the need for personnel service in the expanding field 
of public adult education. 


J. Gustav White 
California State Bureau of 
Vocational Rehabilitation 


Abramson, A., Brodman, K., Harris, H. J., Killinger, G. G., Mittelmann, 
B., Piotrowski, Z., Rapaport, D., Schafer, R., Scheerer, M., Wechsler, 
D., Weider, A., Wolff, H. G., Wladowsky, E., and Zubin, J. 
Non-projective personality tests. Annals of the New York Academy of 
Sciences, Volume XLVI, Art. 7. Pages 531-678, 1946. $1.75. 


Included in this miscellaneous collection of papers are a series on 
screening procedures in military installations, a pair on ability profiles 
(Wechsler-Bellevue), and two directed by title at least at the theory of 
non-projective personality tests. The volume takes its name from the 
conference at which the papers were originally presented. 

A number of papers describe the Cornell Indices. These are paper 
and pencil inventories which “ascertain very directly, by asking questions, 
whether the subject does, or does not, claim to have specific symptoms.” 
Validating data from several studies are presented, and it would appear 
that the procedures do identify, in some situations at least, individuals 
who are unsuited for military service because of neuropsychiatric 
disabilities. Unfortunately, some of these studies are defective in that scores 
are compared for groups which differ in important respects, for example, 
in social and motivational setting (induction center, psychiatric ward, 
officer candidate school, etc.). Only passing attention is paid to the 
problems of motivation and attitude. These problems, which are central 
in projective testing and interviewing as well as in tests of the inventory 
type, would seem to limit the effectiveness of these transparent techniques 
in any situation where the subject has some stake in appearing free from 
maladjustment, e.g. an employment office. 

Two papers on pattern analysis of the Wechsler-Bellevue present 
material of the kind made familiar by Wechsler in the various editions of 
his manual and by Rapaport in Diagnostic Psychological Testing. The 
papers are illustrative only and present no data on the validity of the 
methods. 

A paper by Scheerer is of interest for a review of some recent studies 
which are interpreted from an organismic point of view. In the other 
theoretical paper, Rapaport distinguishes between two kinds of 
psychological functions: non-stationary (drives, motives, desires) and 
quasi-stationary (concept-formation, memory, 
attention-concentration-anticipation, etc.). Non-projective tests may be designed to evaluate the 
latter kind of function, and thus indirectly the modes of control of the 
individual. Rapaport’s demand that measured aspects of personality 
have some theoretical rationale seems reasonable enough. However, 
his rejection of tests which have been validated and standardized because 
they can be used only with individuals who are comparable to the stand- 
ardization group reveals a unique orientation to widely accepted canons of 
test interpretation. 

The most valuable parts of this volume are the short discussions, 
particularly those by Traxler, Wells, and Krugman. It is unfortunate 
that more of the contributors were not drawn from those who have made 
substantial contributions to the field of personality measurement. Per- 
haps a reason for the lack of substance in this volume is that the dis- 
tinction between projective and non-projective tests is one of the least 
fruitful of the many that might be made among devices designed to elicit 
meaningful information about people. 


Robert E. Harris 
University of California Medical School, 
Langley Porter Clinic, 
San Francisco, California 
Books, monographs, and pamphlets for listing and possible review should be sent to 
Donald G. Paterson, Editor, Department of Psychology, University 
of Minnesota, Minneapolis 14, Minnesota 


Management’s responsibility for discipline. H. W. Anderson. Pasadena: 
Industrial Relations Section, California Institute of Technology, 1947. 
Pp. 7. Gratis. 

Play therapy. Virginia M. Axline. Boston: Houghton Mifflin Co., 1947. 
Pp. 379. $3.50. 

Careers in Jewish communal service. Seymour M. Blumenthal and Robert 
Shosteck. Washington, D. C.: B’nai B’rith Vocational Service Bureau, 
1947. Pp. 162. Paper, $1.00; Cloth, $1.60. 

Student personnel services in general education. Paul A. Brouwer. 
Washington, D. C.: American Council on Education, 1947. Pp. 375. 
$3.50. 

Mental hygiene. Herbert A. Carroll. New York: Prentice-Hall, Inc., 
1947. Pp. 325. $3.75. 

How to supervise people in industry. Eliot D. Chapple and Edmond F. 
Wright. Deep River, Conn.: National Foremen’s Institute, 1946. 
Pp. 123. $2.50. 

The elements of marketing. Third edition. Paul D. Converse and 
Harvey W. Huegy. New York: Prentice-Hall, Inc., 1947. Pp. 795. 
$6.35. 

Current trends in psychology. Wayne Dennis, et al. Pittsburgh: Uni- 
versity of Pittsburgh Press, 1947. Pp. 225. $3.50. 

Even the night. Raymond L. Goldman. New York: The Macmillan Co., 
1947. Pp. 196. $2.50. 

Aphasia. A guide to retraining. Louis Granich. New York: Grune and 
Stratton, 1947. Pp. 113. $2.75. 

Getting along with unions. Russell L. Greenman and Elizabeth B. 
Greenman. New York: Harper and Brothers, 1947. Pp. 158. $2.50. 

Good lighting for people at work in reading rooms and offices. Alfred H. 
Holway and Dorothea Jameson. Boston: Harvard Business School, 
1947. Pp. 43. $.76. 

Child psychology. Third edition. Arthur T. Jersild. New York: 
Prentice-Hall, Inc., 1947. Pp. 623. $5.00. 

Physicians of the soul. Charles F. Kemp. New York: The Macmillan 
Co., 1947. Pp. 314. $2.75. 
Jobs and small business. Edward A. Kotite. Yonkers-on-Hudson, New 
York: World Book Co., 1947. Pp. 128. $1.00. 

Managing your mind. S. H. Kraines and E. S. Thetford. New York: 
The Macmillan Co., 1947. Pp. 374. $2.75. 

Children of the Cumberland. Claudia Lewis. New York: Columbia 
University Press, 1947. Pp. 217. $2.75. 

Mathematical analysis of binocular vision. Rudolf K. Luneburg. Prince- 
ton: Princeton University Press, 1947. Pp. 104. $2.50. 

How to create and select winning advertisements. Richard Manville. 
New York: Harper and Brothers, 1947. Pp. 70. $1.50. 

The political problem of industrial civilization. Elton Mayo. Cambridge: 
Graduate School of Business Administration, Harvard University, 
1947. Pp. 26. $.50. 

Problems of child delinquency. Maud A. Merrill. Boston: Houghton 
Mifflin Co., 1947. Pp. 403. $3.50. 

Counseling for mental health. Kate H. Mueller, et al. Washington, 
D. C.: American Council on Education, 1947. Pp. 64. $1.00. 

Personality. Gardner Murphy. New York: Harper and Brothers, 1947. 
Pp. 999. Trade Edition, $7.50. Text Edition, $5.00. 

Now you’re in college. Herbert Popenoe. California: Stanford 
University Press, 1947. Pp. 106. $1.00. 

The twelve rules for straight thinking: applied to business and personal 
problems. William J. Reilly. New York: Harper and Brothers, 1947. 
Pp. 131. $2.00. 

Measurement in today’s schools. C. C. Ross. New York: Prentice-Hall, 
Inc., 1946. Pp. 597. $4.50. 

Secrets of closing sales. Charles B. Roth. New York: Prentice-Hall, 
Inc., 1947. Pp. 221. $3.50. 

Muscular contraction. Alexander Sandow, et al. New York: The New 
York Academy of Sciences, 1947. Pp. 266. $3.00. 

Marriage is on trial. Judge John A. Sbarbaro. New York: The Mac- 
millan Co., 1947. Pp. 128. $2.00. 

Want a job?—or a better job? Ingram See. New York: The Ronald 
Press Co., 1947. Pp. 118. $2.00. 

Psychology for nurses. Mandel Sherman. New York: Longmans, Green 
and Co., Inc., 1947. Pp. 237. $2.75. 

Administrative behavior. Herbert A. Simon. New York: The Macmillan 
Co., 1947. Pp. 259. $4.00. 

Casebook of non-directive counseling. William U. Snyder, Editor. Bos- 
ton: Houghton Mifflin Co., 1947. Pp. 339. $3.00. 

Personnel research. Dewey B. Stuit, Editor. Princeton: Princeton Uni- 
versity Press, 1947. Pp. 513. $7.50. 
Gifted child grows up. Lewis M. Terman. California: Stanford 
University Press, 1947. Pp. 600. $6.00. 

Industrial psychology. Second Edition. Joseph Tiffin. New York: 
Prentice-Hall, Inc., 1947. Pp. 553. $4.00. 

Ethics for today. Second Edition. Harold H. Titus. New York: 
American Book Co., 1947. Pp. 569. $4.00. 

The psychology of human differences. Leona E. Tyler. New York: 
D. Appleton-Century Co., 1947. Pp. 420. $3.75. 

Effective personality building. Gwenyth R. Vaughn and Charles B. Roth. 